RECOMBINANT METHODS FOR PRODUCTION OF SERINE
PROTEASE INHIBITORS AND DNA SEQUENCES USEFUL FOR SAME 

This application is a continuation-in-part of United
States Patent Application Serial No. 678,822, filed December 6,
1984.

BACKGROUND OF THE INVENTION 

Endogenous proteolytic enzymes serve to degrade 
invading organisms, antigen-antibody complexes and certain tissue 
proteins which are no longer necessary or useful to the organism.
In a normally functioning organism, proteolytic enzymes are pro- 
duced in a limited quantity and are regulated in part through the 
synthesis of protease inhibitors. 

A large number of naturally-occurring protease inhib- 
itors serve to control the endogenous proteases by limiting their 
reactions locally and temporally. In addition, the protease in- 
hibitors may inhibit proteases introduced into the body by infec- 
tive and parasitic agents. Tissues that are particularly prone 
to proteolytic attack and infection, e.g., those of the respira- 
tory tract, are rich in protease inhibitors. 

Protease inhibitors comprise approximately 10% of the 
human plasma proteins. At least eight inhibitors have been iso- 
lated from this source and characterized in the literature. 
These include α2-macroglobulin (α2M), α1-protease inhibitor
(α1PI), α1-antichymotrypsin (α1Achy), β1-anticollagenase
(β1AC), and inter-α-trypsin inhibitor (IαI).

A disturbance of the protease/protease inhibitor bal- 
ance can lead to protease mediated tissue destruction, including 
emphysema, arthritis, glomerulonephritis, periodontitis, muscular 
dystrophy, tumor invasion and various other pathological condi- 
tions. In certain situations, e.g., severe pathological pro- 
cesses such as sepsis or acute leukemia, the amount of free 
proteolytic enzymes present increases due to the release of en- 
zyme from the secretory cells. In addition, or separately in 
other situations, a diminished regulating inhibitor capacity of 
the organism may also cause alterations in the protease/protease 
inhibitor balance. An example of such a diminished regulating 
inhibitor capacity is α1-protease inhibitor deficiency, which is
highly correlated with the development of pulmonary emphysema. 

In organisms where such aberrant conditions are present,
serious damage to the organism can occur unless measures can
be taken to control the proteolytic enzymes. Therefore, protease
inhibitors have been sought which are capable of being
administered to an organism to control the proteolytic enzymes.

Leukocyte elastase is an example of a serine protease
of particular interest from a pharmacological standpoint.
Leukocyte elastase, when released extracellularly, degrades
connective tissue and other valuable proteins. While it is
necessary for a normally functioning organism to degrade a certain
amount of connective tissue and other proteins, the presence of
an excessive amount of leukocyte elastase has been associated
with various pathological states, such as emphysema and
rheumatoid arthritis. To counteract the effects of leukocyte
elastase when it is present in amounts greater than normal, a
protease inhibitor has been sought which is effective against
leukocyte elastase. Such a protease inhibitor would be
especially useful if it were capable of being prepared, via a
recombinant DNA method, in a purified form and in sufficient
quantities to be pharmaceutically useful.

In the past, at least two leukocyte elastase inhibitors
have been identified in the literature. One protein, described
in Schiessler et al., "Acid-Stable Inhibitors of Granulocyte Neutral
Proteases in Human Mucous Secretions: Biochemistry and Possible
Biological Function," in Neutral Proteases of Human
Polymorphonuclear Leucocytes, Havemann et al. (eds), Urban and
Schwarzenberg, Inc. (1978), was isolated from human seminal
plasma and sputum and was characterized as being approximately
11 kDa in size with tyrosine as the N-terminal amino acid. The
literature reports of this protein have only furnished a partial
amino acid sequence, but even this partial sequence indicates
that this protein varies substantially from the proteins of the
present invention. The reports of the sequence of this protein,
in combination with amino acid sequence data for proteins of the
present invention, indicate to the present inventors that the
product sequenced by Schiessler et al. may have been a degraded
protein which was not a single-polypeptide chain.

A second protein, isolated in one instance from human
plasma, has been named α1-protease inhibitor. Work on this protein
has been summarized in a review by Travis and Salvesen,
Annual Review of Biochemistry 52: 655-709 (1983). The reports of 
the amino acid sequence of this protein indicate that it too dif- 
fers substantially from the proteins of the present invention. 

Because of the substantial differences in structure between
single-polypeptide-chain proteins of the present invention
and any single-polypeptide-chain serine protease inhibitors of 
the prior art, the single-polypeptide-chain serine protease in- 
hibitors of the prior art are not "substantially homologous" to 
the proteins of the present invention. 

Trypsin is another protease of particular interest from 
a pharmacological standpoint. Trypsin is known to initiate deg- 
radation of certain soft organ tissue, such as pancreatic tissue, 
during a variety of acute conditions, such as pancreatitis. Var- 
ious efforts have been directed toward the treatment of these 
conditions, without marked success, through the use of proteins 
which it was hoped would inhibit the action of trypsin. Illus- 
trative of such efforts are attempts to use exogenous bovine 
trypsin inhibitors in treatment of human pancreatitis. While 
such techniques have been attempted in Europe, they have not been 
approved as effective by the U.S. Food and Drug Administration.
Thus, there is a need for a protease inhibitor effective in
neutralizing excess trypsin in a variety of acute and chronic
conditions. As was the case with the leukocyte elastase inhibitor
discussed above, a trypsin inhibitor would be particularly useful 
if it could be isolated and prepared, by recombinant DNA methods, 
in a purified form and in sufficient quantities to be pharmaceu- 
tically useful. 

Cathepsin G is another protease present in large quantities
in leukocytes. Cathepsin G is known to be capable of
degrading in vitro a variety of valuable proteins, including those
of the complement pathway. Pancreatic elastase is another
protease which may have a role in pancreatitis. Thus, inhibitors
for these proteases are also of potential pharmaceutical value.

Leukocyte elastase, trypsin, cathepsin G and pancreatic
elastase are examples of a class of proteases known as serine
proteases, which have elements of common structure and mechanism.
Their activity against different substrates and their sensitivity
to different inhibitors are believed to result from changes in
only a few amino acid residues. By analogy, it is possible to
conceive of a class of serine protease inhibitors, also having
common elements of structure and mechanism, in which changes in a
relatively few amino acids will result in inhibition of different
proteases, and that at least one member of this class will
inhibit every serine protease of the former class. The class of
serine protease inhibitors would then be of substantial value.

Surprisingly, the present inventors have discovered a 
DNA sequence capable of directing synthesis of such a serine pro- 
tease inhibitor, which inhibitor is biologically equivalent to 
one isolated from parotid secretions. The protease inhibitor of 
the present invention, prepared by the recombinant DNA methods 
set forth herein, is believed to have at least two active sites; 
one site which exhibits leukocyte elastase inhibiting properties 
and a second site which exhibits inhibitory activity against 
trypsin. 

The recombinant inhibitor produced by the present 
invention is believed to be remarkably resistant to denaturation 
by heat and acids and resistant to proteolytic degradation by a 
variety of proteolytic enzymes. As used in this application, it 
is intended that "recombinant inhibitor" refer to a protease in- 
hibitor which is produced by recombinant DNA methodology and 
techniques. Furthermore, the active form of the recombinant
inhibitor of the present invention is thermodynamically stable
under conditions that are normally encountered extracellularly in
the mammalian body. Denatured forms of the recombinant protease
inhibitor also have the ability to form the disulfide bonds and
to form the non-covalent interactions necessary to assume an
active tertiary structure in the absence of biochemical stimulus.

The DNA sequences of the present invention, set forth 
more fully hereinbelow, are capable of directing synthesis of a 
protein which differs greatly from other published leukocyte 
elastase inhibitor sequences. Thus, the identification of the
DNA sequence of the present invention has made possible the 
invention of recombinant DNA methods of manufacturing the novel 
recombinant protease inhibitors disclosed herein. 

Such recombinant methods will allow manufacture of the 
inhibitors in quantities and purities sufficient to provide
economical pharmaceutical compositions which possess serine protease
inhibitory activity. Moreover, the identification of the DNA
sequence has made possible the invention of recombinant DNA methods
of manufacturing analogs of the above described serine protease
inhibitor.

SUMMARY OF THE INVENTION 

This invention relates to recombinant DNA methods for 
the manufacture of protease inhibitors generally and, more spe- 
cifically, to the manufacture of recombinant inhibitors directed 
to human polymorphonuclear (PMN)-granulocyte proteases. In
particular, this invention relates to recombinant DNA methods for
the manufacture of inhibitors for human serine proteases,
including leukocyte elastase and trypsin.

Additionally, the present invention relates to 
recombinant DNA methods for the manufacture of analogs of the in- 
stant serine protease inhibitors. The present invention also re- 
lates to synthetic and natural DNA sequences useful in the 
recombinant DNA methods as set forth below. 

It is an object of the present invention to provide a 
method for recombinant DNA synthesis of a serine protease inhib- 
itor, which inhibitor is a single polypeptide chain that exhibits 
serine protease inhibitor activity. These inhibitors possess 
activity which is biologically equivalent to that activity exhib- 
ited by native leukocyte elastase or trypsin inhibitors isolated 
from human parotid secretions. 

To facilitate alternative recombinant DNA syntheses of 
these serine protease inhibitors, it is a further object of this 
invention to provide synthetic DNA sequences capable of directing 
production of these recombinant protease inhibitors, as well as 
equivalent natural DNA sequences. Such natural DNA sequences may 
be isolated from a cDNA or genomic library from which the gene 
capable of directing synthesis of the protease inhibitor may be
identified and isolated. 

Moreover, it is an object of the present invention to 
provide recombinant DNA methods for the manufacture of analogs of 
the protease inhibitors discussed above and corresponding analo- 
gous DNA sequences useful in such methods. 

Additional objects and advantages of the invention will 
be set forth in part in the description which follows, and in 
part will be obvious from the description or may be learned from 
practice of the invention. The objects and advantages may be re- 
alized and attained by means of the instrumentalities and combi- 
nations particularly pointed out in the appended claims. 

To achieve the objects and in accordance with the pur- 
poses of the present invention, a DNA sequence has been discov- 
ered which is capable of directing the production, by recombinant 
DNA methodology, of protease inhibitors which, in their active 
forms, are single-polypeptide-chain proteins that exhibit serine 
protease inhibitor activity. These recombinant protease inhib- 
itors are remarkably resistant to denaturation by heat and acids. 
Furthermore, these protease inhibitors retain their biological 
activity even after exposure to many proteolytic enzymes, such as 
chymotrypsin, mouse submaxillary protease and clostripain. 

The coding strand of a DNA sequence which has been dis- 
covered to direct manufacture of these recombinant serine pro- 
tease inhibitors is: 



5' AGCGG TAAAA GCTTC AAAGC TGGCG TATGC CCGCC GAAAA AATCC GCGCA
   GTGTC TGCGG TACAA AAAAC CGGAA TGCCA GTCCG ACTGG CAGTG CCCGG
   GTAAA AAACG TTGTT GCCCG GACAC CTGCG GCATC AAATG CCTGG ATCCG
   GTTGA TACCC CGAAC CCGAC TCGTC GAAAA CCGGG TAAAT GCCCG GTAAC
   CTATG GCCAG TGTCT GATGC TGAAC CCGCC GAACT TCTGC GAAAT GGACG
   GCCAG TGTAA ACGAG ATCTG AAATG CTGTA TGGGT ATGTG CGGCA AATCT
   TGTGT TTCCC CGGTA AAAGC ATAA 3'

The nucleotides represented by the foregoing abbrevia- 
tions are set forth in the Detailed Description of the Preferred 
Embodiment.

The coding strand for a second, preferred DNA sequence
which has been discovered to direct manufacture of these 
recombinant serine protease inhibitors, particularly a secretory 
leukocyte protease inhibitor (SLPI) of the present invention, is: 



5' TCTGG TAAAA GCTTC AAAGC TGGCG TATGC CCGCC GAAAA AATCC GCGCA
   GTGTC TGCGG TACAA AAAAC CGGAA TGCCA GTCCG ACTGG CAGTG CCCGG
   GTAAA AAACG TTGTT GCCCG GACAC CTGCG GCATC AAATG CCTGG ATCCG
   GTTGA TACCC CGAAC CCGAC TCGTC GAAAA CCGGG TAAAT GCCCG GTAAC
   CTATG GCCAG TGTCT GATGC TGAAC CCGCC GAACT TCTGC GAAAT GGACG
   GCCAG TGTAA ACGAG ATCTG AAATG CTGTA TGGGT ATGTG CGGCA AATCT
   TGTGT TTCCC CGGTA AAAGC ATAA 3'












To further achieve the objects and in accordance with
the purposes of the present invention, a recombinant DNA method
is disclosed which results in microbial manufacture of the
instant serine protease inhibitors using either the natural or
synthetic DNA sequences referred to above. This recombinant DNA
method comprises:

(a) Preparation of a DNA sequence capable of directing
a host microorganism to produce a protein having serine
protease inhibitor activity, preferably leukocyte elastase
inhibitor activity;

(b) Cloning the DNA sequence into a vector capable of
being transferred into and replicating in a host
microorganism, such vector containing operational elements for
the DNA sequence;

(c) Transferring the vector containing the DNA
sequence and operational elements into a host microorganism
capable of expressing the protease inhibiting protein;

(d) Culturing the microorganism under conditions
appropriate for amplification of the vector and expression
of the inhibitor;

(e) Harvesting the inhibitor; and

(f) Permitting the inhibitor to assume an active
tertiary structure whereby it possesses serine protease
inhibitor activity.

To facilitate identification and isolation of natural
DNA sequences for use in the present invention, the present 
inventors have developed a human parotid tissue cDNA library. 
This library contains the genetic information capable of 
directing a cell to synthesize the serine protease inhibitors of 
the present invention. Other natural DNA sequences which may be 
used in the recombinant DNA methods set forth herein may be iso- 
lated from a human genomic library. 

The synthetic DNA sequences useful in the processes of 
the present invention may be prepared by polynucleotide synthesis 
and sequencing techniques known to those of ordinary skill in the 
art. The natural DNA sequences useful in the foregoing process 
may be identified and isolated through a method comprising: 

(a) Preparation of a human cDNA library from cells, 
preferably parotid cells, capable of generating a serine
protease inhibitor; 

(b) Probing the human DNA library with at least one 
probe capable of binding to the protease inhibitor gene or 
its protein product; 

(c) Identification of at least one clone containing 
the gene coding for the inhibitor by virtue of the ability 
of the clone to bind at least one probe for the gene or its 
protein product; 

(d) Isolation of the gene coding for the inhibitor 
from the clone(s) identified; and 

(e) Linking the gene, or suitable fragments thereof, 
to operational elements necessary to maintain and express 
the gene in a host microorganism.

The natural DNA sequences useful in the foregoing
process may also be identified and isolated through a method
comprising:

(a) Preparation of a human genomic DNA library,
preferably propagated in a recA recBC E. coli host;

(b) Probing the human genomic DNA library with at
least one probe capable of binding to a serine protease
inhibitor gene or its protein product;



WO 86/03519   PCT/US85/02385

(c) Identification of at least one clone containing
the gene coding for the inhibitor by virtue of the ability
of the clone to bind at least one probe for the gene or its
protein product;

(d) Isolation of the gene coding for the inhibitor
from the clone or clones identified; and

(e) Linking the gene, or suitable fragments thereof,
to operational elements necessary to maintain and express
the gene in a host microorganism.

Moreover, to achieve the objects and in accordance with 
the purposes of the present invention, pharmaceutically useful
analogs of the serine protease inhibitor may be produced by the 
above-recited recombinant DNA method by altering the synthetic 
DNA sequence or the natural DNA segment, through recombinant DNA 
techniques, to create a gene capable of inducing expression of 
the desired analog when cloned into an appropriate vector and 
transferred into an appropriate host microorganism. 

Additionally, to achieve the objects and in accordance 
with the purposes of the present invention, pharmaceutical compo- 
sitions containing, as an active ingredient, a recombinant pro- 
tease inhibitor in accordance with the present invention, or its 
biologically active analog produced by the above-recited
recombinant DNA methods, are disclosed.

The accompanying drawings, which are incorporated here- 
in and constitute a part of this application, illustrate various 
plasmids useful in this invention and, together with the descrip- 
tion, serve to explain the principles of the invention. 

BRIEF DESCRIPTION OF THE DRAWINGS

Figure 1 is a map of plasmid pSGE6. 

Figure 2 is a map of plasmid pSGE8. 

Figure 3 is a map of plasmid pGS285. 

Figure 4 is a map of plasmid pGS485. 

DESCRIPTION OF THE PREFERRED EMBODIMENTS

Reference will now be made in detail to the presently 
preferred embodiments of the invention, which, together with the 
following example, serve to explain the principles of the invention.

As noted above, the present invention relates to protease
inhibitors which have been isolated in a purified form.
Preferably, the serine protease inhibitors of the present inven- 
tion are single-polypeptide-chain proteins which are substantial- 
ly homologous to, and most preferably biologically equivalent to, 
native serine protease inhibitors isolated from human parotid se- 
cretions. By "biologically equivalent," as used throughout the 
specification and claims, it is meant that the compositions are 
capable of preventing protease induced tissue damage of the same 
type, but not necessarily to the same degree, as the native protease
inhibitor. By "substantially homologous," as used
throughout the ensuing specification and claims, is meant a
degree of homology to the native parotid inhibitor in excess of
that displayed by previously reported single-polypeptide-chain 
serine protease inhibitor proteins. Preferably, the degree of 
homology is in excess of 40%, most preferably in excess of 50%, 
with a particularly preferred group of proteins being in excess 
of 60% homologous with the native parotid inhibitor. The percentage
homology as above described is calculated as the percentage
of the components found in the smaller of the two sequences
that may also be found in the larger of the two sequences, a
component being understood as a sequence of four contiguous amino
acids.
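
By way of illustration only, the percentage homology calculation just defined might be sketched as follows. The function name `four_mer_homology`, the use of one-letter residue codes, and the choice to count every overlapping window of four residues in the smaller sequence are assumptions made for this sketch; the text itself states only that a component is a sequence of four contiguous amino acids.

```python
def four_mer_homology(seq_a: str, seq_b: str, k: int = 4) -> float:
    """Percent of k-residue components of the shorter sequence found in the longer.

    Assumption: components are taken as every overlapping window of k residues.
    """
    smaller, larger = sorted((seq_a, seq_b), key=len)
    if len(smaller) < k:
        raise ValueError("the shorter sequence contains no component of length k")
    # Every overlapping run of k contiguous residues in the smaller sequence.
    components = [smaller[i:i + k] for i in range(len(smaller) - k + 1)]
    # A component counts as shared if it occurs anywhere in the larger sequence.
    shared = sum(1 for component in components if component in larger)
    return 100.0 * shared / len(components)


if __name__ == "__main__":
    # Toy sequences: only the window "ACDE" of the shorter one occurs in the longer.
    print(round(four_mer_homology("ACDEFG", "KKACDEWW"), 1))
```

Under these assumptions two identical sequences score 100%, and the thresholds of 40%, 50% and 60% recited above would be applied to the returned value.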

Preferred protease inhibitors produced by the present 
recombinant methods are described in United States Patent Appli- 
cation Serial No. 678,823 of Robert C. Thompson et al. entitled 
"Serine Protease Inhibitors and Methods for Isolation of Same," 
filed December 6, 1984 and United States Patent Application Serial
No. ______ of Robert C. Thompson et al. entitled "Serine
Protease Inhibitors and Methods for Isolation of Same," filed of
even date herewith. Such protease inhibitors are remarkably re- 
sistant to denaturation by heat and acids and resistant to loss 
of activity when exposed to many proteolytic enzymes, including 
chymotrypsin, mouse submaxillary protease and clostripain. These 
inhibitors also have the ability to form the necessary disulfide 
bonds and undergo appropriate non-covalent interactions to assume
an active tertiary structure capable of expressing serine 
protease inhibitor activity in the absence of a biochemical stimulus
or, if the disulfide bonds have been broken and the non-
covalent interactions have been disrupted, to re-form such bonds 
and interactions to regain such active tertiary structure in the 
absence of biochemical stimulus. 

A preferred serine protease inhibitor having these 
characteristics has been sequenced. The sequence was determined 
to be as follows:

Ser-Gly-Lys-Ser-Phe-Lys-Ala-Gly-Val-Cys-Pro-
Pro-Lys-Lys-Ser-Ala-Gln-Cys-Leu-Arg-Tyr-Lys-
Lys-Pro-Glu-Cys-Gln-Ser-Asp-Trp-Gln-Cys-Pro-
Gly-Lys-Lys-Arg-Cys-Cys-Pro-Asp-Thr-Cys-Gly-
Ile-Lys-Cys-Leu-Asp-Pro-Val-Asp-Thr-Pro-Asn-
Pro-Thr-Arg-Arg-Lys-Pro-Gly-Lys-Cys-Pro-Val-
Thr-Tyr-Gly-Gln-Cys-Leu-Met-Leu-Asn-Pro-Pro-
Asn-Phe-Cys-Glu-Met-Asp-Gly-Gln-Cys-Lys-Arg-
Asp-Leu-Lys-Cys-Cys-Met-Gly-Met-Cys-Gly-Lys-
Ser-Cys-Val-Ser-Pro-Val-Lys-Ala.

The foregoing abbreviations correspond to the amino 
acid residues in the polypeptide as follows: 



Amino acid         Abbreviation

Alanine            Ala
Valine             Val
Leucine            Leu
Isoleucine         Ile
Proline            Pro
Phenylalanine      Phe
Tryptophan         Trp
Methionine         Met
Glycine            Gly
Serine             Ser
Threonine          Thr
Cysteine           Cys
Tyrosine           Tyr
Asparagine         Asn
Glutamine          Gln
Aspartic acid      Asp
Glutamic acid      Glu
Lysine             Lys
Arginine           Arg
Histidine          His

It has been found that these protease inhibitors manu- 
factured by the recombinant DNA methods disclosed herein have 
more than one distinct domain. By more than one distinct domain 
it is meant that the protein has multiple active sites which are
functional against different enzymes. The presence and location 
of these sites have been determined by the discovery of a sub- 
stantial homology between at least two portions of the protease 
inhibitor. It is believed that the presence of distinct domains 
confers on the instant protease inhibitors the ability to inhibit 
a wide variety of serine proteases that includes both leukocyte 
elastase and trypsin. 
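
The internal homology referred to above can be illustrated with a simple ungapped comparison of the sequenced inhibitor against itself. This is a sketch only and not necessarily the comparison performed by the inventors: the one-letter transcription `SLPI` is taken from the sequence set forth above, while the function name `identities_at_offset` and the offset of 54 residues are assumptions chosen here because that offset brings most of the cysteine residues of the two halves into register.

```python
# One-letter transcription of the 107-residue sequence given above,
# split in the same eleven-residue rows as the printed listing.
SLPI = (
    "SGKSFKAGVCP" "PKKSAQCLRYK" "KPECQSDWQCP" "GKKRCCPDTCG"
    "IKCLDPVDTPN" "PTRRKPGKCPV" "TYGQCLMLNPP" "NFCEMDGQCKR"
    "DLKCCMGMCGK" "SCVSPVKA"
)


def identities_at_offset(seq: str, offset: int) -> list[int]:
    """Return 1-based positions i at which residue i equals residue i + offset."""
    return [i + 1 for i in range(len(seq) - offset) if seq[i] == seq[i + offset]]


# Assumed offset: residue i of the first half is compared with residue i + 54.
matches = identities_at_offset(SLPI, 54)
cys_matches = [p for p in matches if SLPI[p - 1] == "C"]

if __name__ == "__main__":
    print("identical positions:", matches)
    print("cysteines in register:", cys_matches)
```

A gapped alignment would be needed to bring the remaining cysteine pair (positions 18 and 71) into register; the fixed offset is used here only to keep the sketch short.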

It has further been noted that, due to the plurality of 
distinct domains of these protease inhibitors, the protease in- 
hibitors may serve as frameworks on which various other active 
sites may be constructed to create protease inhibitors having 
additional properties. The preferred embodiment of the present 
invention involves production of a protease inhibitor that inhib- 
its leukocyte elastase, cathepsin G, pancreatic elastase and 
trypsin. These enzymes are all members of a class of proteases 
known as serine proteases that share a common mechanism and many 
structural features. It is believed that, through manipulation 
of a few amino acid side-chains on the protease inhibitors
produced by the present invention, a multiplicity of inhibitors may
be created, each being capable of inhibiting at least one member 
of the whole class of serine proteases. Furthermore, such side- 
chain modifications can be expected to yield a plurality of in- 
hibitors having improved inhibitory properties with respect to 
particular members of the class of serine proteases described 
above. 

The amino acid side-chain changes required to attain
these goals are suggested by certain elements of structural simi- 
larity between the preferred inhibitor produced by the present 
invention and other serine protease inhibitors for which the 
important functional part of the inhibitor has been elucidated
through X-ray crystallography. Those elements of structural
similarity include amino acids 17 to 29 and amino acids 70 to 83 of
the preferred serine protease inhibitor produced by the present 
invention described above. The changes suggested to improve the 
inhibitor's activity, either in terms of quantity or quality,
toward trypsin-like serine proteases include changing one or more
of amino acid 20 from Arg to Lys, amino acid 72 or 74 from Leu to 
Lys or Arg, and amino acid 73 from Met to Lys or Arg. 

The changes suggested to improve the inhibitor's activity,
either in terms of quantity or quality, toward chymotrypsin-like
serine proteases, including cathepsin G, include changing
one or more of amino acid 20 from Arg to Phe, Tyr or Trp, amino 
acid 72 or 74 from Leu to Phe, Tyr or Trp, and amino acid 73 from 
Met to Phe, Tyr, or Trp. 

The changes suggested to improve the inhibitor's activity,
either in terms of quantity or quality, toward pancreatic-
elastase-like serine proteases include changing one or more of 
amino acid 20 from Arg to Ala, amino acid 72 or 74 from Leu to 
Ala, and amino acid 73 from Met to Ala. 
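
The single-residue changes listed in the three preceding paragraphs can be expressed as position-checked substitutions on the native sequence. The following is a minimal sketch: the helper name `substitute` and the one-letter transcription `SLPI` are assumptions introduced for illustration, and the check on the expected native residue is there only to catch numbering mistakes.

```python
# One-letter transcription of the preferred 107-residue inhibitor sequence.
SLPI = (
    "SGKSFKAGVCP" "PKKSAQCLRYK" "KPECQSDWQCP" "GKKRCCPDTCG"
    "IKCLDPVDTPN" "PTRRKPGKCPV" "TYGQCLMLNPP" "NFCEMDGQCKR"
    "DLKCCMGMCGK" "SCVSPVKA"
)


def substitute(seq: str, position: int, expected: str, replacement: str) -> str:
    """Replace the residue at a 1-based position after checking the native residue."""
    found = seq[position - 1]
    if found != expected:
        raise ValueError(f"position {position} is {found}, not {expected}")
    return seq[:position - 1] + replacement + seq[position:]


# Examples drawn from the text: Arg20 -> Lys (trypsin-like proteases),
# Met73 -> Ala (pancreatic-elastase-like proteases), Leu72 -> Phe (chymotrypsin-like).
arg20_lys = substitute(SLPI, 20, "R", "K")
met73_ala = substitute(SLPI, 73, "M", "A")
leu72_phe = substitute(SLPI, 72, "L", "F")

if __name__ == "__main__":
    print(arg20_lys[16:29])   # residues 17 to 29 of the Arg20 -> Lys analog
    print(met73_ala[69:83])   # residues 70 to 83 of the Met73 -> Ala analog
```

Whether any given analog retains or gains inhibitory activity is an experimental question; the sketch only records the sequence change.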

It must be borne in mind in the practice of the present 
invention that the alteration of amino acid sequences to confer 
new protease inhibiting properties on the present proteins may 
disrupt the inhibitor's activity toward leukocyte elastase or to- 
ward trypsin. Such effects may be determined by routine experi- 
mentation following the teachings of the present invention. 

Further, it is contemplated that substitution of dis- 
crete amino acids or of discrete sequences of amino acids, as set 
forth above, may enhance either the leukocyte elastase inhibitory 
properties or the trypsin inhibitory properties of the present 
protease inhibitors while sacrificing some activity of the unen- 
hanced domain. Indeed, the activity of any domain within the in- 
hibitor protein may be eliminated entirely by appropriate amino- 
acid substitutions, thereby creating inhibitor proteins which are 
specific for one or some subset of the enzymes against which the 
protein is normally active. For example, substitution of Gly for 
Arg in position 20 deactivates the trypsin inhibitory domain 
while substitution of Gly for Met in the 73 position or for Leu
in the 72 or 74 position deactivates the leukocyte elastase
inhibitory domain. The domains may also be separated into separate
proteins, each of which retains the desired inhibitory functions.
The present claims extend to other processes for producing such
inhibitors by these means.

The present inventors have discovered a synthetic DNA 
sequence which is capable of directing intracellular production 
of the above-discussed protease inhibitors. This sequence has 
the following structure: 
   HindIII
5' AGC GGT AAA AGC TTC AAA GCT GGC GTA TGC CCG CCG
   AluI

   FnuDII RsaI HpaII
   AAA AAA TCC GCG CAG TGT CTG CGG TAC AAA AAA CCG
   HhaI

   XmaI
   GAA TGC CAG TCC GAC TGG CAG TGC CCG GGT AAA AAA
   HpaII
   NciI NciI

   HpaII Fnu4HI BstNI
   CGT TGT TGC CCG GAC ACC TGC GGC ATC AAA TGC CTG

   BamHI
   GAT CCG GTT GAT ACC CCG AAC CCG ACT CGT CGA AAA
   HpaII TaqI

   NciI HpaII BalI
   CCG GGT AAA TGC CCG GTA ACC TAT GGC CAG TGT CTG
   HpaII NciI HaeIII

   ATG CTG AAC CCG CCG AAC TTC TGC GAA ATG GAC GGC
   HaeIII

   BglII
   CAG TGT AAA CGA GAT CTG AAA TGC TGT ATG GGT ATG
   MboI

   Fnu4HI NciI
   TGC GGC AAA TCT TGT GTT TCC CCG GTA AAA GCA TAA 3'
   HpaII

wherein the following nucleotides are represented by the
abbreviations indicated below.

Nucleotide             Abbreviation

deoxyadenylic acid     A
deoxyguanylic acid     G
deoxycytidylic acid    C
thymidylic acid        T
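
As a consistency check, the intracellular coding sequence set forth above can be translated with the standard genetic code and compared with the amino acid sequence given earlier. The sketch below is illustrative: the names `CODON_TABLE`, `CODING` and `translate` are assumptions, and the codon table is the standard one built from the conventional T, C, A, G ordering rather than anything specific to this disclosure.

```python
# Standard genetic code, built from the conventional TCAG codon ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

# Coding strand for intracellular production, 5' to 3', transcribed from the text.
CODING = (
    "AGC GGT AAA AGC TTC AAA GCT GGC GTA TGC CCG CCG "
    "AAA AAA TCC GCG CAG TGT CTG CGG TAC AAA AAA CCG "
    "GAA TGC CAG TCC GAC TGG CAG TGC CCG GGT AAA AAA "
    "CGT TGT TGC CCG GAC ACC TGC GGC ATC AAA TGC CTG "
    "GAT CCG GTT GAT ACC CCG AAC CCG ACT CGT CGA AAA "
    "CCG GGT AAA TGC CCG GTA ACC TAT GGC CAG TGT CTG "
    "ATG CTG AAC CCG CCG AAC TTC TGC GAA ATG GAC GGC "
    "CAG TGT AAA CGA GAT CTG AAA TGC TGT ATG GGT ATG "
    "TGC GGC AAA TCT TGT GTT TCC CCG GTA AAA GCA TAA"
).replace(" ", "")


def translate(dna: str) -> str:
    """Translate a coding strand codon by codon, stopping at the first stop codon."""
    protein = []
    for i in range(0, len(dna) - 2, 3):
        residue = CODON_TABLE[dna[i:i + 3]]
        if residue == "*":
            break
        protein.append(residue)
    return "".join(protein)


if __name__ == "__main__":
    print(translate(CODING))
```

The translation runs for 107 codons and ends at the terminal TAA, in agreement with the 107-residue sequence beginning Ser-Gly-Lys and ending Val-Lys-Ala.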

The present inventors have discovered a second, pre- 
ferred synthetic DNA sequence which is capable of directing ex- 
tracellular production of the above-discussed protease inhib- 
itors, particularly the secretory leukocyte protease inhibitor 
(SLPI) referred to above. This sequence has the following struc- 
ture: 



   HindIII
5' TCT GGT AAA AGC TTC AAA GCT GGC GTA TGC CCG CCG
   AluI

   FnuDII
   AAA AAA TCC GCG CAG TGT CTG CGG TAC AAA AAA CCG

   XmaI
   GAA TGC CAG TCC GAC TGG CAG TGC CCG GGT AAA AAA
   HpaII
   NciI

   HpaII
   CGT TGT TGC CCG GAC ACC TGC GGC ATC AAA TGC CTG
   NciI Fnu4HI BstNI

   BamHI
   GAT CCG GTT GAT ACC CCG AAC CCG ACT CGT CGA AAA
   TaqI

   HpaII HpaII BalI
   CCG GGT AAA TGC CCG GTA ACC TAT GGC CAG TGT CTG
   HpaII NciI HaeIII

   ATG CTG AAC CCG CCG AAC TTC TGC GAA ATG GAC GGC
   HaeIII

   BglII
   CAG TGT AAA CGA GAT CTG AAA TGC TGT ATG GGT ATG
   MboI

   Fnu4HI NciI
   TGC GGC AAA TCT TGT GTT TCC CCG GTA AAA GCA TAA 3'
   HpaII
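
The restriction sites annotated on the second synthetic sequence above can be located by a plain text search of the coding strand. The sketch below covers only the five six-base recognition sequences among the annotated enzymes; the names `SECRETED_CODING`, `SIX_CUTTERS` and `find_sites` are assumptions for illustration, and the search looks only at the strand shown, which suffices because these recognition sequences are palindromic.

```python
# Coding strand for extracellular production, 5' to 3', transcribed from the text.
SECRETED_CODING = (
    "TCT GGT AAA AGC TTC AAA GCT GGC GTA TGC CCG CCG "
    "AAA AAA TCC GCG CAG TGT CTG CGG TAC AAA AAA CCG "
    "GAA TGC CAG TCC GAC TGG CAG TGC CCG GGT AAA AAA "
    "CGT TGT TGC CCG GAC ACC TGC GGC ATC AAA TGC CTG "
    "GAT CCG GTT GAT ACC CCG AAC CCG ACT CGT CGA AAA "
    "CCG GGT AAA TGC CCG GTA ACC TAT GGC CAG TGT CTG "
    "ATG CTG AAC CCG CCG AAC TTC TGC GAA ATG GAC GGC "
    "CAG TGT AAA CGA GAT CTG AAA TGC TGT ATG GGT ATG "
    "TGC GGC AAA TCT TGT GTT TCC CCG GTA AAA GCA TAA"
).replace(" ", "")

# Recognition sequences of the six-base cutters named in the annotation.
SIX_CUTTERS = {
    "HindIII": "AAGCTT",
    "XmaI": "CCCGGG",
    "BamHI": "GGATCC",
    "BalI": "TGGCCA",
    "BglII": "AGATCT",
}


def find_sites(dna: str, enzymes: dict[str, str]) -> dict[str, list[int]]:
    """Return, per enzyme, the 1-based start positions of its recognition sequence."""
    hits = {}
    for name, motif in enzymes.items():
        positions = []
        start = dna.find(motif)
        while start != -1:
            positions.append(start + 1)
            start = dna.find(motif, start + 1)
        hits[name] = positions
    return hits


sites = find_sites(SECRETED_CODING, SIX_CUTTERS)

if __name__ == "__main__":
    for enzyme, positions in sites.items():
        print(enzyme, positions)
```

Each of these five enzymes has a single recognition sequence in the coding strand, which is what makes them convenient for exchanging defined segments of the synthetic gene when analogs are constructed.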

Due to the multiple domain structure of the instant protease
inhibitors, as noted above, variations are contemplated in
the synthetic DNA sequence set forth herein which will result in 
a DNA sequence which is capable of directing production of the 
serine protease inhibitor analogs as discussed above. In partic- 
ular, preferred analogs of the serine protease inhibitors manu- 
factured by recombinant DNA techniques according to the present 
invention have the amino acid sequence: 

R1-Gly-Lys-Ser-Phe-Lys-Ala-Gly-Val-Cys-Pro-
Pro-Lys-Lys-Ser-Ala-Gln-Cys-Leu-R2-Tyr-Lys-
Lys-Pro-Glu-Cys-Gln-Ser-Asp-Trp-Gln-Cys-Pro-
Gly-Lys-Lys-Arg-Cys-Cys-Pro-Asp-Thr-Cys-Gly-
Ile-Lys-Cys-Leu-Asp-Pro-Val-Asp-Thr-Pro-Asn-
Pro-Thr-Arg-Arg-Lys-Pro-Gly-Lys-Cys-Pro-Val-
Thr-Tyr-Gly-Gln-Cys-R8-R3-R9-Asn-Pro-Pro-
Asn-Phe-Cys-Glu-R4-Asp-Gly-Gln-Cys-Lys-Arg-
Asp-Leu-Lys-Cys-Cys-R5-Gly-R6-Cys-Gly-Lys-
Ser-Cys-Val-Ser-Pro-Val-Lys-R7,

wherein,

R1 and R7 are the same or different and are selected
from the group consisting of a substituted or unsubstituted amino
acid residue or derivative thereof; and

R2, R3, R4, R5, R6, R8 and R9 are the same or different
and are selected from the group consisting of methionine, valine,
alanine, phenylalanine, tyrosine, tryptophan, lysine, glycine and
arginine.

It should be noted that the DNA sequence set forth
above represents a preferred embodiment of the present invention.
Due to the degeneracy of the genetic code, it is to be understood
that numerous choices of nucleotides may be made which will lead
to a DNA sequence capable of directing production of the instant
protease inhibitors or their analogs. As such, DNA sequences
which are functionally equivalent to the sequence set forth above,
or which are functionally equivalent to sequences which would
direct production of analogs of the protease inhibitor produced
pursuant to the amino acid sequence set forth above, are intended
to be encompassed within the present invention. As an example of
the codon substitutions that are contemplated as a result of the
degenerate genetic code, the following diagram represents
additional DNA sequences which are intended to be included within
the scope of the present invention for manufacture of the preferred
amino acid sequence enumerated above. By following the example
for determining equivalent DNA sequences for production of this
protein, those of ordinary skill in the art will be able to
determine equivalent DNA sequences for production of analogs of the
preferred amino acid sequence as well.

                                       10
   Ser Gly Lys Ser Phe Lys Ala Gly Val Cys Pro Pro Lys Lys Ser Ala
5' TCN GGN AAP TCN TTQ AAP GCN GGN GTN TGQ CCN CCN AAP AAP TCN GCN
   AGQ         AGQ                                         AGQ

               20                                      30
   Gln Cys Leu Arg Tyr Lys Lys Pro Glu Cys Gln Ser Asp Trp Gln Cys
   CAP TGQ CTN CGN TAQ AAP AAP CCN GAP TGQ CAP TCN GAQ TGG CAP TGQ
           TTP AGP                             AGQ

                               40
   Pro Gly Lys Lys Arg Cys Cys Pro Asp Thr Cys Gly Ile Lys Cys Leu
   CCN GGN AAP AAP CGN TGQ TGQ CCN GAQ ACN TGQ GGN ATQ AAP TGQ CTN
                   AGP                             ATA         TTP

       50                                      60
   Asp Pro Val Asp Thr Pro Asn Pro Thr Arg Arg Lys Pro Gly Lys Cys
   GAQ CCN GTN GAQ ACN CCN AAQ CCN ACN CGN CGN AAP CCN GGN AAP TGQ
                                       AGP AGP

                       70                                      80
   Pro Val Thr Tyr Gly Gln Cys Leu Met Leu Asn Pro Pro Asn Phe Cys
   CCN GTN ACN TAQ GGN CAP TGQ CTN ATG CTN AAQ CCN CCN AAQ TTQ TGQ
                               TTP     TTP

                                       90
   Glu Met Asp Gly Gln Cys Lys Arg Asp Leu Lys Cys Cys Met Gly Met
   GAP ATG GAQ GGN CAP TGQ AAP CGN GAQ CTN AAP TGQ TGQ ATG GGN ATG
                               AGP     TTP

               100
   Cys Gly Lys Ser Cys Val Ser Pro Val Lys Ala
   TGQ GGN AAP TCN TGQ GTN TCN CCN GTN AAP GCN 3'
               AGQ         AGQ

In the above sequence, the abbreviations used are intended to
represent the nucleotides indicated below.

Nucleotide Abbreviation 

A,G,C,T N 
A,G P 
C,T Q 
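
By way of illustration only, the degenerate notation defined above can be expanded mechanically into concrete codons. The following minimal Python sketch is not part of the original disclosure; the names `expand` and `matches` are assumptions introduced here for the example.

```python
from itertools import product

# Degenerate symbols exactly as defined in the abbreviation table above.
DEGENERATE = {"N": "AGCT", "P": "AG", "Q": "CT"}


def expand(pattern):
    """Return every concrete nucleotide string covered by a degenerate pattern."""
    choices = [DEGENERATE.get(symbol, symbol) for symbol in pattern]
    return ["".join(combo) for combo in product(*choices)]


def matches(pattern, codon):
    """Return True if a concrete codon is allowed by the degenerate pattern."""
    return len(pattern) == len(codon) and all(
        base in DEGENERATE.get(symbol, symbol)
        for symbol, base in zip(pattern, codon)
    )


# Serine appears in the diagram as TCN with the alternative AGQ.
serine_codons = sorted(expand("TCN") + expand("AGQ"))
print(serine_codons)
```

A chosen codon can be checked against the diagram with `matches`, for example `matches("TGQ", "TGC")` for cysteine.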

When selecting codons for use in the synthetic DNA sequences
of the present invention, including that set forth immediately
above, it is preferred that the codons used to indicate a
particular amino acid be those which are associated with highly
expressed proteins. Examples of these preferred codons are set
forth in part in Grantham, R. et al., "Codon Catalog Usage Is a
Genome Strategy Modulated For Gene Expressivity" in Nucleic Acids
Research 9:r43 (1981). The preferred DNA sequence of the present
invention was chosen by selecting Escherichia coli sequence
codons for any of the degenerate sequences.

Additionally, it is desired to select codons which facilitate
the alteration of the synthetic DNA sequence to construct
additional synthetic DNA sequences which are capable of
directing production of analogs of the present protease
inhibitors. In particular, it is preferred that nucleotide sequences
are selected which, if possible, create restriction endonuclease 
sites at, or close to, positions in the synthetic DNA sequence 
into which it may be desired to insert additional codons or at 
which sites it may be desired to replace a codon so that analogs 
may be created. In the preferred embodiment of the DNA sequence 
of the present invention, the restriction sites are indicated 
below the nucleotide sequence set forth above. 
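
As a hedged illustration of how codon choice can create a restriction site without altering the encoded protein, the following Python sketch enumerates every synonymous encoding of the tripeptide Leu-Asp-Pro (residues 48 to 50 of the preferred sequence) and keeps those containing the BamHI recognition sequence GGATCC. The helper names are assumptions made for this example, and the recognition sequence is the standard published one rather than a value stated in this text.

```python
from itertools import product

# Degenerate symbols as used in this document: N = A/G/C/T, P = A/G, Q = C/T.
DEGENERATE = {"N": "AGCT", "P": "AG", "Q": "CT"}

# Degenerate codon patterns for the three residues, as given in the diagram.
PATTERNS = {"Leu": ["CTN", "TTP"], "Asp": ["GAQ"], "Pro": ["CCN"]}


def codons_for(residue):
    """List every concrete codon allowed for a residue."""
    result = []
    for pattern in PATTERNS[residue]:
        choices = [DEGENERATE.get(symbol, symbol) for symbol in pattern]
        result.extend("".join(combo) for combo in product(*choices))
    return result


def encodings_with_site(peptide, site):
    """Return all synonymous DNA encodings of the peptide that contain the site."""
    options = [codons_for(residue) for residue in peptide]
    encodings = ("".join(combo) for combo in product(*options))
    return [dna for dna in encodings if site in dna]


bamhi_encodings = encodings_with_site(["Leu", "Asp", "Pro"], "GGATCC")
print(len(bamhi_encodings), "encodings contain GGATCC")
```

The encoding CTG GAT CCG used in the preferred sequence is one of the candidates this search returns.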

Methods of creating the synthetic DNA sequences contemplated
herein are generally within the ambit of routine tasks
performed by one of ordinary skill in the art guided by the instant
disclosure. An example of a suitable method which may be used to
obtain the synthetic DNA sequence disclosed herein is set forth
in Matteucci, M.D. and Caruthers, M.H., J. Am. Chem. Soc. 103:3185
(1981) and Beaucage, S.L. and Caruthers, M.H., Tetrahedron Lett.
22:1859 (1981), both of which are specifically incorporated
herein by reference.

In an alternate embodiment of the present invention, a
DNA sequence has been isolated from a human genomic library which
encodes a preferred secretory leukocyte protease inhibitor (SLPI)
of the present invention. This sequence, encoding from the
fourth codon and including the introns as presently known to the
inventors, is as follows:

EcoRI 20 . 40 . 60
GAATTCTGGT GGGGC CACACCCACTGGTGAAAGAATAAATAGTGAGGTTTGGATTGGGC 
INTRON ■ > 



80 . 100 . 120 

ATCAGAGTCACTCCTGCCTTCACCATGAAGTCCAGCGGCCTCTTCCCCTTCCTGGTGCTG 



140 . 160 . 180 

CTTGCCCTGGAACTCTGGCACTTGGGCTTGGAAGGCTCTGAAATGTAAGTTGGAGTCACT 



PstI . 220 . 240

CTGTCTAATCTGGGCTGCAGGGTCAGAGGTGGGGTCTCCTTGTGGTGTGGGTGTGTCCCC



260 . 280 . 300 

TTCTGTAGGCTCTGATCCCTCAGCTTAGTTTCGGGAGACCTCCCTGAGGGTGGAATACAT 



SacI 320 . 340 . 360

GTCTGGCTGAGCTCCAAGGTTTGTGTGACAGTTTGAGCTTCTGGAAATGCTTCCTCTATG 



380 . 400 . 420

CAGCCATGCTGTCAGCCCAGGTCCCACTCTCTCTCTCTCTCTCTCTCTCTCTCTCTCTCA 



440 . 460 . 480 

TACTCCGCCTTCTTCTTCACCTTGCTGCGACTCTCAAATCATTAGTTTCTGACTCTGCTT 



500 . 520 . 540 

CCGTTGTGTCTTTGCTTCTGCTATTTTGTCTCTGTGCTTCTCGCTTGGGATTTAGCTCTC 



560 . 580 . 600 

AACTTCTCTCACACTGGTTCTATTTATCTTTGTTTACCTCTCTCCATCTCCATCACTCCC 



620 . 640 . 660

AGCCTTCCTCTCTGCCTTTGTGTAGCCTTGTTTTGCTCTTGGGTGGAGGTCTTGACTAGA



680 . 700 . 720

AGCCTGCTGCCCTTTTCTTGGGTGTGAAACGTCCCCTGTCCATTTGTCTAATTTAATCAA 



740 . 760 . 780

GCCCATCAATACACCTGGAGATCAGGCAGGCATGACCTTTGGGCTTTGTGGACAGCTACT 



800 . 820 . 840

GAGGTAAGGGTCTCTCCCCCTCAAAAGTGGTGCTTTGTTCAGGAGGCATGATGGGTCCTC 



860 . 880 . 900 
AGTACCCAGCCTCCTCCTACCTCTTGACTTTCTCTTCAAAAGCCTTCAAAGCTGGACTCT 
END OF INTRON F K A G V 

920 . 940 . 960

GTCCTCCTAAGAAATCTGCCCAGTGCCTTAGATACAAGAAACCTGAGTGCCAGAGTGACT
C P P K K S A Q C L R Y K K P E C Q S D

980 . 1000 . BamHI

GGCAGTGTCCAGGGAAGAAGAGATGTTGTCCTGACAGTTGTGGCATCAAATGCCTGGATC
W Q C P G K K R C C P D T C G I K C L D

1040 . 1060 . 1080 
CTGTTGACACCCCAAACCCAAGTAAGCAGGTCGGGGAACTGGGTAGAGAGATAGCCTGGG 
P V D T P N P -- START INTRON

1100 . 1120 StuI . 1140

GACACAGCATTAGAGGGACGGAACTGGGTGATGGGTCCTGCCAGGCCTCCTTGTCAATGC 



1160 . PvuII . 1200

CGTAGTGAGTCACAGTGCCCTAAGAGAAGTAGCCAGCTGGTGAAGCAGCGGGCATTTAGA



1220 . 1240 . 1260

TAGCCAGGTAGTTGGAAGCCTCCCACCTAGTCAGCACTGGGCGGCTGGCACCTGCATAAT 



1280 . 1300 . 1320 

GGGGGGCCTGAAGTTCTAGGAGAGCCAGGTGCTATGTTTGGGGGCCGCCTTAGGGAGAAG 



1340 . 1360 . 1380

GTGGTGGTGATAGAGGTGGGGAGGGGATGATCCCCCCTGCTGAAGCTGGACGAGGGGCTC 



1400 . 1420 StuI . 1440
ACTCTAAAAAGTGGGGATGGGAGGGGTTGTATAAAGTACAAGGCCTCTGACCGGTAGCCT
END OF INTRON

1460 . 1480 . 1500 
CACTCTCACCCAACCCAGCAAGGAGGAAGCCTGGGAAGTGCCCAGTGACTTATGGCCAAT 
R R K P G K C P V T Y G Q

1520 . 1540 . 1560

GTTTGATGCTTAACCCCCCCAATTTCTGTGAGATGGATGGCCAGTGCAAGCGTGACTTGA
C L M L N P P N F C E M D G Q C K R D L

1580 . 1600 . 1620

AGTGTTGCATGGGCATGTGTGGGAAATCCTGCGTTTCCCCTGTGAAAGGTAAGCAGGGGA
K C C M G M C G K S C V S P V K -- START INTRON

SacI 1640 . 1660 . 1680

CGAGGGCACACTGAGCTCCCTCAGCCCTCTCAGCCTCAACCCTCTGGAGGCCCAGGCATA 



1700 . 1720 . 1740 

TGGGCAGGGGGACTCCTGAACCCTACTCCAAGCACAGCCTCTGTCTGACTCCCTTGTCCT 



1760 . 1780 . 1800

TCAAGAGAACTGTTCTCCAGGTCTCAGGGCCAGGATTTCCATAGGAGTCGCCTGTGGCTT 



1820 . 1840 . 1860 

TGATTCTATTCTAGTGTCTCTGGGTGGGGGTCCTGGGCAAGTGTCTTTCTGAGTCTAGTT



1880 . 1900 . 1920 

TCTTTATCGGTAAAATGTACATAATGAGATGAAAGTGCTCTGCAAAGACCTATGTGCACT 



1940 . 1960 . 1980

AAGAATTATTATTCAGGTGTTTCCATCATGTTTTCTGAGGTGAAATCACAAAGGATCAGT



2000 . 2020 . 2040 

GGAGTTTGAGGATTATCTAGTTCAATGCTTTGAGTTTAGAGTTTTACGTGAAAATGAGAC



2060 . 2080 . 2100 

TTGTCTCCTGACACTAAGTCTCTCTCAACTATAGCGCTATCTTGCTATTTTCTCTATCTC 



2120 . 2140 . 2160

AGAAGGATCCTTGGGCAGGAGGAAGGATGTGGATATATGATTTGGCTGGTTTCTATGCTG 



2180 . 2200 . 2220

AAGCTCTGATCTGATTTTCTCTCACAGCTTGATTCCTGCCATATCGGAGGAGGCTCTGGA 

-PROBABLE STOP 

END OF INTRON 

2240 . 2260 

GCCTGCTCTGTGTGGTCCAGGTCCTTTCCACCCTGAGCTTGGCTCCACCACTGGT 

In this sequence, the abbreviations used for the amino acid
residues are the one-letter abbreviations which are commonly employed
and may be found, for example, in Biochemistry by A.L. Lehninger,
2nd ed., Worth Publishers, Inc., New York, New York (1976), pg.
72.
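
The one-letter annotation printed beneath the exons can be checked by translating a stretch of the genomic sequence. The following is a minimal sketch, assuming the standard genetic code; the function name `translate` and the choice of fragment are illustrative and not taken from the original disclosure.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... GGG.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna, offset=0):
    """Translate DNA into one-letter amino acid codes, starting at the given offset."""
    remaining = len(dna) - offset
    usable = remaining - remaining % 3  # ignore a trailing partial codon
    return "".join(
        CODON_TABLE[dna[i:i + 3]] for i in range(offset, offset + usable, 3)
    )


# Nucleotides 901 to 960 of the genomic sequence as printed above.
exon_fragment = "GTCCTCCTAAGAAATCTGCCCAGTGCCTTAGATACAAGAAACCTGAGTGCCAGAGTGACT"

# The first two bases complete a codon begun on the previous line,
# so translation starts at offset 2.
print(translate(exon_fragment, offset=2))
```

The result agrees with the annotation under that line, apart from the leading Cys, whose codon begins on the preceding line.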

Using this sequence and the amino acid sequence data 
contained herein, a synthetic DNA sequence can be constructed, 
that, when added to the genomic sequence above, leads to a gene 
which codes for the entire protease inhibitor. Alternatively, 
probes may be constructed using the DNA sequence set forth above 
and used to retrieve a DNA segment from a human genomic library 
which has codons for the first three amino acids. 

Additionally, such probes may be used to identify a 
human genomic sequence which contains an appropriate leader se- 
quence. It is contemplated that this leader sequence, or any 
other appropriate leader sequence, could be used in conjunction 
with this genomic DNA sequence in a mammalian expression system. 

In another alternate embodiment of the present inven- 
tion, a cDNA clone has been isolated from a parotid library which 
encodes a DNA sequence capable of directing intracellular produc- 
tion of a preferred secretory leukocyte protease inhibitor of the 
present invention. This clone is included on on deposit
at American Type Culture Collection, Rockville, Maryland,
under Accession No. .

A recombinant DNA method for the manufacture of a pro- 
tease inhibitor composed of a single polypeptide chain with at 
least one active site possessing serine protease inhibitor activ- 
ity has been disclosed. In one embodiment of the invention, the 
active site functions in a manner biologically equivalent to that 
of the native leukocyte elastase inhibitor isolated from human 
parotid secretions. A natural or synthetic DNA sequence may be 
used to direct production of the protease inhibitors. This meth- 
od comprises: 

(a) Preparation of a DNA sequence capable of directing 
a host microorganism to produce a protein having serine pro- 
tease inhibitor activity; 

(b) Cloning the DNA sequence into a vector capable of
being transferred into and replicated in a host microorganism,
such vector containing operational elements for the
DNA sequence;

(c) Transferring the vector containing the synthetic
DNA sequence and operational elements into a host microorganism
capable of expressing the protease inhibitor;

(d) Culturing the microorganism under conditions 
appropriate for amplification of the vector and expression 
of the inhibitor; 

(e) Harvesting the inhibitor; and 

(f) Permitting the inhibitor to assume an active tertiary
structure whereby it possesses serine protease inhibitory
activity.

Synthetic DNA sequences contemplated for use in this 
method have been discussed in detail above. It is further con- 
templated, in an alternative embodiment, that natural DNA se- 
quences may also be used in this method. These sequences include 
cDNA or genomic DNA segments. In a preferred version of this
embodiment, it is contemplated that the natural DNA sequence will
be obtained by a method comprising:

(a) Preparation of a human cDNA library from cells, 
preferably parotid cells, capable of generating a serine 
protease inhibitor; 

(b) Probing the human DNA library with at least one 
probe capable of binding to the protease inhibitor gene or 
its protein product; 

(c) Identification of at least one clone containing 
the gene coding for the inhibitor by virtue of the ability 
of the clone to bind at least one probe for the gene or its 
protein product; 

(d) Isolation of the gene coding for the inhibitor
from the clone or clones chosen; and

(e) Linking the gene, or suitable fragments thereof,
to operational elements necessary to maintain and express
the gene in a host microorganism.

The natural DNA sequences useful in the foregoing process
may also be identified and isolated through a method
comprising:

(a) Preparation of a human genomic DNA library,
preferably propagated in a recA recBC E. coli host;

(b) Probing the human genomic DNA library with at
least one probe capable of binding to a serine protease
inhibitor gene or its protein product;

(c) Identification of at least one clone containing 
the gene coding for the inhibitor by virtue of the ability 
of the clone to bind at least one probe for the gene or its 
protein product; 

(d) Isolation of the gene coding for the inhibitor 
from the clone(s) identified; and 

(e) Linking the gene, or suitable fragments thereof, 
to operational elements necessary to maintain and express 
the gene in a host microorganism. 

In isolating a natural DNA sequence suitable for use in
the above method, it is preferred to identify the two restriction
sites located within and closest to the end portions of the
appropriate gene or sections of the gene. The DNA segment
containing the appropriate gene is then removed from the remainder
of the genomic material using appropriate restriction
endonucleases. After excision, the 3' and 5' ends of the DNA
sequence are reconstructed to provide appropriate DNA sequences
capable of coding for the N- and C-termini of the serine protease
inhibitor protein and capable of fusing the DNA sequence to its
operational elements.

The vectors contemplated for use in the present inven- 
tion include any vectors into which a DNA sequence as discussed 
above can be inserted, along with any preferred or required oper- 
ational elements, and which vector can then be subsequently 
transferred into a host microorganism and replicated in such mi- 
croorganism. Preferred vectors are those whose restriction sites 
have been well documented and which contain the operational ele- 
ments preferred or required for transcription of the DNA se- 
quence. 

The "operational elements," as discussed herein, include
at least one promoter, at least one operator, at least one
leader sequence, at least one Shine-Dalgarno sequence, at least
one terminator codon, and any other DNA sequences necessary or
preferred for appropriate transcription and subsequent
translation of the vector DNA. In particular, it is contemplated
that such vectors will contain at least one origin of replication
recognized by the host microorganism along with at least one
selectable marker and at least one promoter sequence capable of
initiating transcription of the synthetic DNA sequence. It is
additionally preferred that the vector, in one embodiment,
contains certain DNA sequences capable of functioning as regulators,
and other DNA sequences capable of coding for regulator protein.
These regulators, in one embodiment, serve to prevent expression
of the synthetic DNA sequence in the presence of certain
environmental conditions and, in the presence of other environmental
conditions, allow transcription and subsequent expression of the
protein coded for by the synthetic DNA sequence. In particular,
it is preferred that regulatory segments be inserted into the
vector such that expression of the synthetic DNA will not occur
in the absence of, for example, isopropylthio-β-D-galactoside.
In this situation, the transformed microorganisms containing the
synthetic DNA may be grown to a desired density prior to
initiation of the expression of the protease inhibitor. In this
embodiment, expression of the desired protease inhibitor is induced
by addition of a substance to the microbial environment capable
of causing expression of the DNA sequence after the desired
density has been achieved.

Additionally, it is preferred that an appropriate
secretory leader sequence be present, either in the vector or at
the 5' end of the synthetic DNA sequence. The leader sequence is
in a position which allows the leader sequence to be immediately
adjacent to the initial portion of the nucleotide sequence
capable of directing expression of the protease inhibitor without any
intervening translation termination signals. The presence of the
leader sequence is desired in part for one or more of the
following reasons: 1) the presence of the leader sequence may
facilitate host processing of the initial product to the mature
recombinant protease inhibitor; 2) the presence of the leader sequence
may facilitate purification of the recombinant protease
inhibitors, through directing the protease inhibitor out of the cell
cytoplasm; 3) the presence of the leader sequence may affect the
ability of the recombinant protease inhibitor to fold to its
active structure through directing the protease inhibitor out of
the cell cytoplasm.

In particular, the leader sequence may direct cleavage
of the initial translation product by a leader peptidase to
remove the leader sequence and leave a polypeptide with the
preferred amino acid sequence which has the potential of serine
protease inhibitory activity. In some species of host
microorganisms, the presence of the appropriate leader sequence
will allow transport of the completed protein into the
periplasmic space, as in the case of E. coli. In the case of
certain yeasts and strains of Bacilli and Pseudomonas, the
appropriate leader sequence will allow transport of the protein
through the cell membrane and into the extracellular medium. In
this situation, the protein may be purified from extracellular
protein.

Thirdly, in the case of some of the protease inhibitors
prepared by the present invention, the presence of the leader
sequence may be necessary to locate the completed protein in an
environment where it may fold to assume its active structure, which
structure possesses the appropriate elastase-inhibitor activity.

Additional operational elements include, but are not 
limited to, ribosome binding sites and other DNA sequences neces- 
sary for microbial expression of foreign proteins. In a pre- 
ferred embodiment of the present invention, the sequence 
GAGGCGCAAAAA(ATG) would be used as the ribosome binding site. 
The operational elements as discussed herein are routinely se- 
lected by those of ordinary skill in the art in light of prior 
literature and the teachings contained herein. General examples
of these operational elements are set forth in B. Lewin, Genes,
Wiley & Sons, New York (1983), which is specifically incorporated
herein by reference. The vectors as contemplated herein may be
constructed in part from portions of plasmids pBR322 and/or pIQ.

In one preferred embodiment of the present invention, 
an additional DNA sequence is located immediately preceding the 
synthetic DNA sequence which codes for the protease inhibitor.
The additional DNA sequence is capable of functioning as a 



SUBSTITUTE SHEET 



WO 86/03519 



PCT/US85/02385 



-27- 

translational coupler, i.e., it is a DNA sequence that encodes an 
RNA which serves to position ribosomes immediately adjacent to 
the ribosome binding site of the protease inhibitor RNA with 
which it is contiguous. In one embodiment of the present
invention, the translational coupler may be derived using the DNA
sequence TAACGAGGCGCAAAAAATGAAAAAGACAGCTATCGCGATCGGAGTGTAAGAAATG
and methods currently known to those of ordinary skill in the art
related to translational couplers. A second, preferred
translational coupler has the DNA sequence

TAACGAGGCGCAAAAAATGAAAAAGACAGCTATCGCGATCAAGGAGAAATAAATG.
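
The layout of the two coupler sequences can be examined with a short script. The following Python sketch is offered only as an illustration; the function name `describe_coupler` is an assumption, and the interpretation of the final ATG as the start codon of the inhibitor gene follows the description above. It locates the ribosome binding site, walks the short reading frame that follows it to the first in-frame stop codon, and reports how far that stop lies from the final ATG.

```python
RBS = "GAGGCGCAAAAA"  # ribosome binding site named in the text
STOP_CODONS = {"TAA", "TAG", "TGA"}

COUPLERS = {
    "first": "TAACGAGGCGCAAAAAATGAAAAAGACAGCTATCGCGATCGGAGTGTAAGAAATG",
    "second": "TAACGAGGCGCAAAAAATGAAAAAGACAGCTATCGCGATCAAGGAGAAATAAATG",
}


def describe_coupler(seq):
    """Return (sense codons in the short leading frame, bases between its stop and the final ATG)."""
    start = seq.find(RBS) + len(RBS)  # the ATG directly after the binding site
    if seq[start:start + 3] != "ATG":
        raise ValueError("no ATG directly after the ribosome binding site")
    for position in range(start, len(seq) - 2, 3):
        if seq[position:position + 3] in STOP_CODONS:
            break
    else:
        raise ValueError("no in-frame stop codon found")
    sense_codons = (position - start) // 3
    downstream_atg = seq.rfind("ATG")  # taken here as the inhibitor start codon
    gap = downstream_atg - (position + 3)
    return sense_codons, gap


for name, seq in COUPLERS.items():
    print(name, describe_coupler(seq))
```

In both sequences the short frame terminates a few bases or fewer before the final ATG, which is consistent with the positioning role attributed to the coupler.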

Upon synthesis and isolation of all necessary and de- 
sired component parts of the above-discussed vector, the vector 
is assembled by methods generally known to those of ordinary 
skill in the art. Assembly of such vectors is believed to be 
within the duties and tasks performed by those with ordinary 
skill in the art and, as such, is capable of being performed 
without undue experimentation. For example, similar DNA
sequences have been ligated into appropriate cloning vectors, as
set forth in Schoner et al., Proceedings of the National Academy
of Sciences U.S.A., 81:5403-5407 (1984), which is specifically
incorporated herein by reference.

In construction of the cloning vector of the present 
invention it should additionally be noted that multiple copies of 
the synthetic DNA sequence and its attendant operational elements 
may be inserted into each vector. In such an embodiment the host 
organism would produce greater amounts per vector of the desired 
protease inhibitor. The number of multiple copies of the DNA se- 
quence which may be inserted into the vector is limited only by 
the ability of the resultant vector, due to its size, to be 
transferred into and replicated and transcribed in an appropriate 
host microorganism. 

Additionally, it is preferred that the vector contain a 
selectable marker, such as a drug resistance marker or other 
marker which causes expression of a selectable trait by the host 
microorganism. In a particularly preferred embodiment of the 
present invention, the gene for tetracycline resistance is pref- 
erably included on the cloning vector. 




Such a drug resistance or other selectable marker is
intended in part to facilitate the selection of transformants.
Additionally, the presence of such a selectable marker on the 
cloning vector may be of use in keeping contaminating microorga- 
nisms from multiplying in the culture medium. In this embodi- 
ment, such a pure culture of the transformed host microorganisms 
would be obtained by culturing the microorganisms under condi- 
tions which require the induced phenotype for survival. 

The vector thus obtained is then transferred into the
appropriate host microorganism. It is believed that any
microorganism having the ability to take up exogenous DNA and
express those genes and attendant operational elements may be
chosen. It is preferred that the host microorganism be a
facultative anaerobe or an aerobe. Particular hosts which may be
preferable for use in this method include yeasts and bacteria.
Specific yeasts include those of the genus Saccharomyces, and
especially Saccharomyces cerevisiae. Specific bacteria include
those of the genera Bacillus, Escherichia, and Pseudomonas,
especially Bacillus subtilis and Escherichia coli.

After a host organism has been chosen, the vector is
transferred into the host organism using methods generally known
by those of ordinary skill in the art. Examples of such methods
may be found in Advanced Bacterial Genetics by R. W. Davis et
al., Cold Spring Harbor Press, Cold Spring Harbor, New York,
(1980), which is specifically incorporated herein by reference.
It is preferred, in one embodiment, that the transformation occur
at low temperatures, as temperature regulation is contemplated as
a means of regulating gene expression through the use of
operational elements as set forth above. In another embodiment, if
osmolar regulators have been inserted into the vector, regulation
of the salt concentrations during the transformation would be
required to ensure appropriate control of the synthetic genes.

If it is contemplated that the recombinant serine
protease inhibitors will ultimately be expressed in yeast, it is
preferred that the cloning vector first be transferred into
Escherichia coli, where the vector would be allowed to replicate
and from which the vector would be obtained and purified after
amplification. The vector would then be transferred into the
yeast for ultimate expression of the serine protease inhibitor.

The host microorganisms are cultured under conditions
appropriate for the expression of the serine protease inhibitor.
These conditions are generally specific for the host organism,
and are readily determined by one of ordinary skill in the art,
in light of the published literature regarding the growth
conditions for such organisms, for example Bergey's Manual of
Determinative Bacteriology, 8th Ed., Williams & Wilkins Company,
Baltimore, Maryland, which is specifically incorporated herein by
reference.

Any conditions necessary for the regulation of the
expression of the DNA sequence, dependent upon any operational
elements inserted into or present in the vector, would be in effect
at the transformation and culturing stages. In one embodiment,
the cells are grown to a high density in the presence of
appropriate regulatory conditions which inhibit the expression of the
DNA sequence. When optimal cell density is approached, the
environmental conditions are altered to those appropriate for
expression of the synthetic DNA sequence. It is thus contemplated that
the production of the protease inhibitor will occur in a time
span subsequent to the growth of the host cells to near optimal
density, and that the resultant protease inhibitor will be
harvested at some time after the regulatory conditions necessary for
its expression were induced.

In a preferred embodiment of the present invention, the 
recombinant protease inhibitor is purified subsequent to harvest- 
ing and prior to assumption of its active structure. This em- 
bodiment is preferred as the inventors believe that recovery of a 
high yield of re-folded protein is facilitated if the protein is 
first purified. However, in one preferred, alternate embodiment,
the protease inhibitor may be allowed to re-fold to assume its
active structure prior to purification. In yet another
preferred, alternate embodiment, the protease inhibitor is present
in its re-folded, active state upon recovery from the culturing
medium.

In certain circumstances, the protease inhibitor will
assume its proper, active structure upon expression in the host
microorganism and transport of the protein through the cell wall
or membrane or into the periplasmic space. This will generally
occur if DNA coding for an appropriate leader sequence has been
linked to the DNA coding for the recombinant protein. If the
protease inhibitor does not assume its proper, active structure,
any disulfide bonds which have formed and/or any noncovalent
interactions which have occurred will first be disrupted by
denaturing and reducing agents, for example, guanidinium chloride
and β-mercaptoethanol, before the protease inhibitor is allowed
to assume its active structure following dilution and oxidation
of these agents under controlled conditions.

It is to be understood that application of the teachings
of the present invention to a specific problem or environment
will be within the capabilities of one having ordinary skill
in the art in light of the teachings contained herein. Examples
of the products of the present invention and representative
processes for their isolation and manufacture appear in the
following example.

EXAMPLE 1

On the basis of the amino acid sequence described
above, the codon usage in highly expressed genes of Escherichia
coli, and the provision of convenient restriction endonuclease
cleavage sites, the following DNA sequence was proposed:

HindIII

5' AGC GGT AAA AGC TTC AAA GCT GGC GTA TGC CCG CCG

AluI

FnuDII RsaI HpaII

AAA AAA TCC GCG CAG TGT CTG CGG TAC AAA AAA CCG
HhaI

XmaI

GAA TGC CAG TCC GAC TGG CAG TGC CCG GGT AAA AAA

HpaII

NciI

NciI

CGT TGT TGC CCG GAC ACC TGC GGC ATC AAA TGC CTG
HpaII Fnu4HI BstNI

BamHI

GAT CCG GTT GAT ACC CCG AAC CCG ACT CGT CGA AAA
HpaII TaqI

NciI HpaII BalI

CCG GGT AAA TGC CCG GTA ACC TAT GGC CAG TGT CTG
HpaII NciI HaeIII

ATG CTG AAC CCG CCG AAC TTC TGC GAA ATG GAC GGC

HaeIII

BglII

CAG TGT AAA CGA GAT CTG AAA TGC TGT ATG GGT ATG

MboI

Fnu4HI NciI

TGC GGC AAA TCT TGT GTT TCC CCG GTA AAA GCA TAA 3'

HpaII
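
The positions of several of the six-base restriction sites in the proposed gene can be confirmed with a simple scan. The following Python sketch is illustrative only; the function name `find_sites` is an assumption, and the recognition sequences listed are the standard published ones for each enzyme rather than values stated in this text.

```python
# Synthetic gene of Example 1, one printed row of twelve codons per line
# (spaces removed).
SYNTHETIC_GENE = (
    "AGCGGTAAAAGCTTCAAAGCTGGCGTATGCCCGCCG"
    "AAAAAATCCGCGCAGTGTCTGCGGTACAAAAAACCG"
    "GAATGCCAGTCCGACTGGCAGTGCCCGGGTAAAAAA"
    "CGTTGTTGCCCGGACACCTGCGGCATCAAATGCCTG"
    "GATCCGGTTGATACCCCGAACCCGACTCGTCGAAAA"
    "CCGGGTAAATGCCCGGTAACCTATGGCCAGTGTCTG"
    "ATGCTGAACCCGCCGAACTTCTGCGAAATGGACGGC"
    "CAGTGTAAACGAGATCTGAAATGCTGTATGGGTATG"
    "TGCGGCAAATCTTGTGTTTCCCCGGTAAAAGCATAA"
)

# Standard recognition sequences (assumed here, not given in the text).
RECOGNITION_SITES = {
    "HindIII": "AAGCTT",
    "XmaI": "CCCGGG",
    "BamHI": "GGATCC",
    "BglII": "AGATCT",
}


def find_sites(dna, sites):
    """Map each enzyme name to the 0-based start positions of its recognition sequence."""
    hits = {}
    for enzyme, recognition in sites.items():
        positions = []
        start = dna.find(recognition)
        while start != -1:
            positions.append(start)
            start = dna.find(recognition, start + 1)
        hits[enzyme] = positions
    return hits


site_positions = find_sites(SYNTHETIC_GENE, RECOGNITION_SITES)
for enzyme, positions in site_positions.items():
    print(enzyme, positions)
```

Sites such as BamHI and BglII divide the gene into segments that correspond to the separately synthesized fragments described in this example.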

To regulate the expression of the protein in a form
suitable for export to the periplasm of E. coli, the following
regulatory elements were proposed: a tac promoter for initiation
of transcription at high levels; a lac operator for transcription
regulation; a lac repressor (lacIq), to be coded elsewhere on
the plasmid; an OmpA Shine-Dalgarno sequence to initiate
translation at a high level; an OmpA leader to facilitate periplasmic
export of the product; an Ala of an Ala-Ser junction between the
protein sequence encoded by these operator elements and that
encoded by the structural genes described above to dictate
cleavage of the initial product to yield the mature leukocyte elastase
inhibitor. All of these features are incorporated into the
following DNA sequence:

CTGCA GCTGT TGACA ATTAA TCATC GGCTC GTCTC GTATA ATGTG ATAAC GAGGC 
GCAAA AAATG AAAAA GACAG CTATC GCGAT CGCAG TGGCA 
CTGGC TGGTT TCGCT ACCGT AGCGC AGGCC. 

To regulate the expression of the protein in a form
such that the protein remains in the E. coli cytoplasm, the
following operational elements are proposed: the tac promoter; the
lac operator, and the lac repressor (lacIq); a consensus of
Shine-Dalgarno sequences; and, to initiate a high level of
translation, a fragment of the OmpA leader peptide to be used as a
translational coupler. The translational coupling sequence com- 
prises the DNA coding for the translation initiation region of 
the OmpA gene, the first eight amino acids of the OmpA leader 
peptide, the consensus Shine-Dalgarno sequence described above 
and a translational terminator. The translational coupling se- 
quence is to be inserted between the promoter and the translation 
initiation site of the serine protease inhibitor gene, 
overlapping the latter. All of these features are incorporated 
into the following DNA sequence: 

CTGCA GCTGT TGACA ATTAA TCATC GGCTC GTCTC GTATA ATGTG ATAAC GAGGC 
GCAAA AAATG AAAAA GACAG CTATC GCGAT CGGAG TGTAA GAAAT G. 

A. Construction of Gene Fragments

To construct the above sequences, the following
deoxyribonucleotides are synthesized using the ABI DNA synthesizer
(Foster City, California). The products are purified by
polyacrylamide gel electrophoresis as described in the ABI
instrument manual. They are 5' phosphorylated using T4
polynucleotide kinase and ATP using standard means.

The following group of oligonucleotide sequences are 
used to construct fragment Aa. 

Oligonucleotide Aa1 is:
GCTGT TGACA ATTAA TCAT.

Oligonucleotide Aa2 is:
CGGCT CGTAT AATGT GTGGA ATTGT GAGCG GATAA CAATT T.

Oligonucleotide Aa3 is: 
CACAC ATAAC GAGGC GCAAA AA. 

Oligonucleotide Aa4 is: 
ATGAA AAAGA CAGCT ATCGC GATCG. 

Oligonucleotide Aa5 is: 
CAGTG GCACT GGCTG GTTTC GCTAC CGTAG CGCAG GCCAG CGGTA AA. 

Oligonucleotide Aa6 is: 
GAGCC GATGA TTAAT TGTCA ACAGC TGCA.

Oligonucleotide Aa7 is: 
TCCGC TCACA ATTCC ACACA TTATA C. 

Oligonucleotide Aa8 is: 
CCTCG TTATG TGTGA AATTG TTA. 

Oligonucleotide Aa9 is: 
GCCAC TGCGA TCGCG ATAGC TGTCT TTTTC ATTTT TTGCG. 

Oligonucleotide Aa10 is:
AGCTT TTACC GCTGG CCTGC GCTAC GGTAG CGAAA CCAGC CAGT.

The following oligonucleotide sequences are assembled to make
fragment Ab.

Nucleotide Ab1 is:
GCTGT TGACA ATTAA TCAT.

Nucleotide Ab2 is:
CGGCT CGTAT AATGT GTGGA ATTGT GAGCG GATAA CAATT T.

Nucleotide Ab3 is:
CACAC ATAAC GAGGC GCAAA AA.

Nucleotide Ab4 is:
ATGAA AAAGA CAGCT ATCGC GATCG.

Nucleotide Ab5 is:
GAGTG TAAGA AATGA GCGGT AAA.

Nucleotide Ab6 is:
GAGCC GATGA TTAAT TGTCA ACAGC TGCA.

Nucleotide Ab7 is:
TCCGC TCACA ATTCC ACACA TTATA C.

Nucleotide Ab8 is:
CCTCG TTATG TGTGA AATTG TTA.

Nucleotide Ab9 is:
AGCTT TTACC GCTCA TTTCT TACAC TCCGA TCGCG ATAGC
TGTCT TTTTC ATTTT TTGCG.

The following are the oligonucleotide sequences assembled to con- 
struct fragment B. 

Oligonucleotide B1 is:
AGCTT CAAAG CTGGC GTATG CCCGC CGAAA AAATC CGCG.

Oligonucleotide B2 is:
CAGTG TCTGC GGTAC AAAAA ACCGG AATGC CAG.

Oligonucleotide B3 is:
TCCGA CTGGC AGTGC CCGGG TAAAA AACGT TGTTG C.

Oligonucleotide B4 is:
CCGGA CACCT GCGGC ATCAA ATGCC TG.

Oligonucleotide B5 is:
GATCC AGGCA TTTGA TGCCG CAGGT GTCCG GGCAA CAACG TTTTT
TACCC GGGCA.

Oligonucleotide B6 is:
CTGCC AGTCG GACTG GCATT CCGGT TTTTT GTACC G.

Oligonucleotide B7 is: 
CAGAC ACTGC GCGGA TTTTT TCGGC GGGCA TACGC CAGCT TTGA. 
The following are the oligonucleotide sequences used to construct 
fragment C. 

Oligonucleotide C1 is:
GATCC GGTTG ATACC CCGAA CCCG.

Oligonucleotide C2 is:
ACTCG TCGAA AA.

Oligonucleotide C3 is:
CCGGG TAAAT GCCCG GTAAC CTATG GC.

Oligonucleotide C4 is:
CAGTG TCTGA TGCTG AACCC GCCGA AC.

Oligonucleotide C5 is:
TTCTG CGAAA TGGAC GGCCA GTGTA AACGA GAT.

Oligonucleotide C6 is:
CTAGA TCTCG TTTAC ACTGG CCGTC CATTT CGCAG AAGTT.

Oligonucleotide C7 is:
CGGCG GGTTC AGCAT CAGAC ACTGG CCATA GGTTA CCGGG CA.

Oligonucleotide C8 is:
TTTAC CCGGT TTTCG ACGAG TCGGG TT.

Oligonucleotide C9 is:
CGGGG TATCA ACCG.

The following group of oligonucleotide sequences are assembled to 
form fragment D. 

Oligonucleotide D1 is:
GATCT GAAAT GCTGT ATGGG TATGT GCGGC. 

Oligonucleotide D2 is: 
AAATC TTGTG TTTCC CCGGT AAAAG CATAA G. 

Oligonucleotide D3 is: 
TCGAC TTATG CTTTT ACCGG GGAAA CACAA GATTT GCCGC A. 

Oligonucleotide D4 is: 
CATAC CCATA CAGCA TTTCA. 

The following groups of oligonucleotides are mixed and
annealed under standard conditions and ligated to each other and
to cloning and sequencing vectors M13 mp18 and 19 cut with
appropriate restriction endonucleases using T4 DNA ligase under
standard conditions. The products are used to transform E. coli
JM105, and clones containing the DNA of interest are selected from
white plaques on IPTG-Xgal plates and further screened by
hybridization with 32P-labelled oligonucleotides selected from the
group used in the annealing step. The insert structure is
confirmed by dideoxy sequencing of the cloned DNA using a universal
primer.

Group Aa contains oligonucleotides Aa1-Aa10, which are
ligated to M13 mp18 and 19 cut with PstI and HindIII. Group Ab
contains oligonucleotides Ab1-Ab9, which are ligated to M13 mp18
and 19 cut with PstI and HindIII. Group B, which contains
oligonucleotides B1 to B7, is ligated to M13 mp18 and 19 cut with
HindIII and BamHI. Group C, which contains oligonucleotides C1
to C9, is ligated to M13 mp18 and 19 cut with BamHI and XbaI.
Group D, which contains oligonucleotides D1 to D4, is ligated to
M13 mp18 and 19 cut with BamHI and SalI.
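
As an illustration of why each group can be ligated into a vector cut with the stated pair of enzymes, the following minimal Python sketch checks that the 5' end of each strand of Groups B, C and D begins with the single-stranded overhang left by the corresponding enzyme. The overhang table uses the standard recognition sites of these enzymes; the dictionary and function names are hypothetical and are not part of the original disclosure. Groups Aa and Ab are omitted because PstI leaves a 3' rather than a 5' overhang.

```python
# 5' single-stranded overhangs left by each enzyme (standard recognition sites).
OVERHANG = {"HindIII": "AGCT", "BamHI": "GATC", "XbaI": "CTAG", "SalI": "TCGA"}

# For each group: (5'-terminal top-strand oligo, 5'-terminal bottom-strand oligo),
# copied from the oligonucleotide lists with the spacing removed.
GROUP_ENDS = {
    "B": ("AGCTTCAAAGCTGGCGTATGCCCGCCGAAAAAATCCGCG",                     # B1
          "GATCCAGGCATTTGATGCCGCAGGTGTCCGGGCAACAACGTTTTTTACCCGGGCA"),    # B5
    "C": ("GATCCGGTTGATACCCCGAACCCG",                                    # C1
          "CTAGATCTCGTTTACACTGGCCGTCCATTTCGCAGAAGTT"),                   # C6
    "D": ("GATCTGAAATGCTGTATGGGTATGTGCGGC",                              # D1
          "TCGACTTATGCTTTTACCGGGGAAACACAAGATTTGCCGCA"),                  # D3
}

# Enzymes used to cut the M13 vector for each group (left end, right end).
VECTOR_CUT = {"B": ("HindIII", "BamHI"), "C": ("BamHI", "XbaI"), "D": ("BamHI", "SalI")}


def ends_match(group):
    """True if both strands of the group start with the expected 5' overhang."""
    left_enzyme, right_enzyme = VECTOR_CUT[group]
    top_5prime, bottom_5prime = GROUP_ENDS[group]
    return (top_5prime.startswith(OVERHANG[left_enzyme])
            and bottom_5prime.startswith(OVERHANG[right_enzyme]))


for name in "BCD":
    print(name, ends_match(name))
```

Note that D1 begins with GATCT, a BglII-type end; its GATC overhang is nevertheless compatible with the BamHI-cut vector.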

M13 replicative form DNA is recovered from the clone 
having the desired insert DNA by standard means. The insert DNA 
corresponding to Group Aa is excised from the M13 DNA by cutting 
the DNA with appropriate restriction endonucleases and is 
purified by polyacrylamide gel electrophoresis. Its structure 
is:
AATTCGAGCTCGGTACCCGGGGATCCTCTAGAGTCGACCTGCAGCTG
GCTCGAGCCATGGGCCCCTAGGAGATCTCAGCTGGACGTCGAC

TTGACAATTAATCATCGGCTCGTATAATGTGTGGAATTGTGAGCG 
AACTGTTAATTAGTAGCCGAGCATATTACACACCTTAACACTCGC 

GATAACAATTTCACACATAACGAGGCGCAAAAA 
CTATTGTTAAAGTGTGTATTGCTCCGCGTTTTT 

ATGAAAAAGACAGCTATCGCGATCGCAGTGGCACTGGCT 
TACTTTTTCTGTCGATAGCGCTAGCGTCACCGTGACCGA 

GGTTTCGCTACCGTAGCGCAGGCCAGC 
CCAAAGCGATGGCATCGCGTCCGGTCG 

GGTAAA 
CCATTTTCGA 

The insert DNA corresponding to Group Ab is excised by
cutting the DNA with restriction endonucleases EcoRI and HindIII
and is purified by polyacrylamide gel electrophoresis. Its
structure is:

AATTCGAGCTCGGTACCCGGGGATCCTCTA 
GCTCGAGCCATGGGCCCCTAGGAGAT 

GAGTCGACCTGCAGCTGTTGACAATTAATC 
CTCAGCTGGACGTCGACAACTGTTAATTAG 

ATCGGCTCGTATAATGTGTGGAATTGTGAG
TAGCCGAGCATATTACACACCTTAACACTC

CGGATAACAATTTCACACATAACGAGGCGC
GCGTATTGTTAAAGTGTGTATTGCTCCGCG

AAAAAATGAAAAAGACAGCTATCGCGATCGG
TTTTTTACTTTTTCTGTCGATAGCGCTAGCC

AGTGTAAGAAATGAGCGGTAAA
TCACATTCTTTACTCGCCATTTTCGA

The insert DNA corresponding to Group B is excised by
cutting the DNA with restriction endonucleases HindIII and BamHI
and is purified by polyacrylamide gel electrophoresis. Its
structure is:

AGCTTCAAAGCTGGCGTATGCCCGCCG 
AGTTTCGACCGCATACGGGCGGC 

AAAAAATCCGCGCAGTGTCTGCGGTACAAA
TTTTTTAGGCGCGTCACAGACGCCATGTTT

AAACCGGAATGCCAGTCCGACTGGCAGTGC
TTTGGCCTTACGGTCAGGCTGACCGTCACG 

CCGGGTAAAAAACGTTGTTGCCCGGACACC 
GGCCCATTTTTTGCAACAACGGGCCTGTGG 

TGCGGCATCAAATGCCTG 
ACGCCGTAGTTTACGGACCTAG 

The insert DNA corresponding to Group C is excised by
cutting the DNA with restriction endonucleases BamHI and BglII
and is purified by polyacrylamide gel electrophoresis. Its
structure is:

GATCCGGTTGATACCCCGAACCCGACT 
GCCAACTATGGGGCTTGGGCTGA 

CGTCGAAAACCGGGTAAATGCCCGGTA 
GCAGCTTTTGGCCCATTTACGGGCCAT 

ACCTATGGCCAGTGTCTGATGCTGAACCCG 
TGGATACCGGTCACAGACTACGACTTGGGC 

CCGAACTTCTGCGAAATGGACGGCCAGTGT 
GGCTTGAAGACGCTTTACCTGCCGGTCACA 

AAACGA 
TTTGCTCTAG 

The insert DNA corresponding to Group D is excised by
cutting the DNA with restriction endonucleases SauIIIA and SalI
and is purified by acrylamide gel electrophoresis. Its structure
is:

GATCTGAAATGCTGTATGGGTATG
ACTTTACGACATACCCATAC 

TGCGGCAAATCTTGTGTTTCCCCG 
ACGCCGTTTAGAACACAAAGGGGC 

GTAAAAGCATAAG 
CATTTTCGTATTCAGCT 
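
The double-stranded structures above are written with the top strand 5' to 3' and the bottom strand 3' to 5', with unpaired bases at each end forming the cohesive ends. A minimal Python sketch of how such a listing can be checked for base-pairing consistency is given below, using the Group D insert as the example. The helper name and its overhang parameters are hypothetical and are introduced only for illustration.

```python
# Watson-Crick complement for unambiguous DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def paired_region_ok(top, bottom, left_overhang, right_overhang):
    """Check that the paired parts of two printed strands are complementary.

    top            -- top strand, written 5'->3'
    bottom         -- bottom strand, written 3'->5' (as printed in the text)
    left_overhang  -- unpaired bases at the 5' end of the top strand
    right_overhang -- unpaired bases at the right-hand end of the bottom strand
    """
    paired_top = top[left_overhang:]
    paired_bottom = bottom[:len(bottom) - right_overhang]
    return paired_top.translate(COMPLEMENT) == paired_bottom


# Group D insert, lines joined in the order printed above.
top = "GATCTGAAATGCTGTATGGGTATG" "TGCGGCAAATCTTGTGTTTCCCCG" "GTAAAAGCATAAG"
bottom = "ACTTTACGACATACCCATAC" "ACGCCGTTTAGAACACAAAGGGGC" "CATTTTCGTATTCAGCT"

# Four unpaired bases at each end: GATC on the left, TCGA (printed AGCT) on the right.
print(paired_region_ok(top, bottom, left_overhang=4, right_overhang=4))
```

The same check can be applied to the other inserts by supplying their strands and the overhang lengths of the enzymes used to excise them.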

B. Construction of the Gene 

In the construction for export, the inserts from groups
Aa, B, C, and D are combined and ligated to M13 mp18 and 19 cut
with EcoRI and SalI using T4 DNA ligase under standard
conditions. The clones containing the gene are selected by their
color on Xgal plates and screened further by hybridization with
the 32P-labelled oligonucleotide. The structure of selected
clones is confirmed by dideoxy sequencing of the insert region of
the DNA using the universal primer.

In the construction for cytoplasmic expression, the
inserts from groups Ab, B, C, and D are combined and ligated to M13
mp18 and 19 cut with EcoRI and SalI using T4 DNA ligase under
standard conditions. The clones containing the genes are
selected by their color on Xgal plates and screened further by
hybridization with the 32P-labelled insert. The structures of
selected clones are confirmed by dideoxy sequencing of the insert
region of the DNA using the universal primer.

EXAMPLE 2 

On the basis of the amino acid sequence described
above, the codon usage in highly expressed genes of Escherichia
coli, and the provision of convenient restriction endonuclease
cleavage sites, the following DNA sequence was proposed:

5' TCT GGT AAA AGC TTC AAA GCT GGC GTA TGC CCG CCG
   (HindIII, AluI)

   AAA AAA TCC GCG CAG TGT CTG CGG TAC AAA AAA CCG
   (FnuDII, HhaI, RsaI, HpaII)

   GAA TGC CAG TCC GAC TGG CAG TGC CCG GGT AAA AAA
   (XmaI, HpaII, NciI)

   CGT TGT TGC CCG GAC ACC TGC GGC ATC AAA TGC CTG
   (HpaII, NciI, Fnu4HI, BstNI)

   GAT CCG GTT GAT ACC CCG AAC CCG ACT CGT CGA AAA
   (BamHI, HpaII, TaqI)

   CCG GGT AAA TGC CCG GTA ACC TAT GGC CAG TGT CTG
   (HpaII, HpaII, BalI, NciI, NciI, HaeIII)

   ATG CTG AAC CCG CCG AAC TTC TGC GAA ATG GAC GGC
   (HaeIII)

   CAG TGT AAA CGA GAT CTG AAA TGC TGT ATG GGT ATG
   (BglII, MboI)

   TGC GGC AAA TCT TGT GTT TCC CCG GTA AAA GCA TAA 3'
   (Fnu4HI, NciI, HpaII)
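
To make the relationship between the proposed DNA sequence and the encoded protein explicit, the following Python sketch translates the sequence with the standard genetic code. It is offered only as an illustration: the sequence is transcribed from the listing above, with its codons cross-checked against the fragment B, C and D structures, and the names GENE, CODON_TABLE and translate are hypothetical.

```python
# Standard genetic code, built compactly in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {a + b + c: AMINO[16 * i + 4 * j + k]
               for i, a in enumerate(BASES)
               for j, b in enumerate(BASES)
               for k, c in enumerate(BASES)}

# Proposed coding sequence, one listing line (12 codons) per string.
GENE = (
    "TCTGGTAAAAGCTTCAAAGCTGGCGTATGCCCGCCG"
    "AAAAAATCCGCGCAGTGTCTGCGGTACAAAAAACCG"
    "GAATGCCAGTCCGACTGGCAGTGCCCGGGTAAAAAA"
    "CGTTGTTGCCCGGACACCTGCGGCATCAAATGCCTG"
    "GATCCGGTTGATACCCCGAACCCGACTCGTCGAAAA"
    "CCGGGTAAATGCCCGGTAACCTATGGCCAGTGTCTG"
    "ATGCTGAACCCGCCGAACTTCTGCGAAATGGACGGC"
    "CAGTGTAAACGAGATCTGAAATGCTGTATGGGTATG"
    "TGCGGCAAATCTTGTGTTTCCCCGGTAAAAGCATAA"
)


def translate(dna):
    """Translate codon by codon from the first base, stopping at a stop codon."""
    protein = []
    for pos in range(0, len(dna) - 2, 3):
        residue = CODON_TABLE[dna[pos:pos + 3]]
        if residue == "*":
            break
        protein.append(residue)
    return "".join(protein)


protein = translate(GENE)
print(len(protein), protein)
```

Under these assumptions the translation is 107 residues long and ends at the TAA codon, consistent with the reference elsewhere in this document to amino acids four through 107 of the synthetic gene.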

To regulate the expression of the protein in a form
suitable for export to the periplasm of E. coli, the following
regulatory elements are proposed: a tac promoter on plasmid
pKK223-3 for initiation of transcription at high levels; a lac
operator on plasmid pKK223-3 for transcription regulation; a lac
repressor (lac I^q), to be encoded on the chromosome of E. coli
strain JM107; an OmpA Shine-Dalgarno sequence to initiate
translation at a high level; an OmpA leader to facilitate periplasmic
export of the product; an Ala of an Ala-Ser junction between the
protein sequence encoded by these operator elements and that
encoded by the structural genes described above to dictate
cleavage of the initial product to yield the mature leukocyte elastase
inhibitor. The ompA elements are incorporated into the following
DNA sequence:

GAATT CGATA TCTCG TTGGA GATAT TCAT GACGT ATTTT GGATG ATAAC GAGGC 
GCAAA AAATG AAAAA GACAG CTATC GCGAT CGCAG TGGCA 
CTGGC TGGTT TCGCT ACCGT AGCGC AGGCC. 
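
The Ala-Ser junction described above can be illustrated with a short Python sketch. It takes the leader-coding portion of the sequence just given (from the ATG through the final GCC) and confirms that it is a whole number of codons ending in an alanine codon, so that the serine codon TCT of the structural gene follows in frame. The variable names are hypothetical and the sketch is illustrative only.

```python
# OmpA leader coding region, copied from the sequence above starting at ATG.
LEADER = ("ATGAAAAAGACAGCTATCGCGATCGCAGTGGCACTGGCTGGTTTCGCTACC"
          "GTAGCGCAGGCC")

# First codon of the structural gene proposed earlier (Ser).
FIRST_STRUCTURAL_CODON = "TCT"

leader_codons = [LEADER[i:i + 3] for i in range(0, len(LEADER), 3)]

# The leader must be an exact number of codons for the fusion to stay in frame.
in_frame = len(LEADER) % 3 == 0

# GCC encodes Ala and TCT encodes Ser in the standard genetic code.
junction = leader_codons[-1] + "|" + FIRST_STRUCTURAL_CODON

print(len(leader_codons), in_frame, junction)
```

With 21 leader codons, a leader whose initiation codon sits at positions 62-64 would place the first codon of the mature protein 63 bases later, at positions 125-127.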

To regulate the expression of the protein in a form
such that the protein remains in the E. coli cytoplasm, the
following operational elements are proposed: the tac promoter on
plasmid pKK223-3; the lac operator of plasmid pKK223-3 and the
lac repressor (lac I^q) on the chromosome of E. coli strain JM107;
a consensus Shine-Dalgarno sequence; and, to initiate a high
level of translation, a fragment of the OmpA leader peptide to be
used as a translational coupler. The translational coupling
sequence comprises the DNA coding for the translation initiation
region of the OmpA gene, the first eight amino acids of the OmpA
leader peptide, the consensus Shine-Dalgarno sequence described
above and a translational terminator. The translational coupling
sequence is to be inserted between the lac operator and the
translation initiation site of the serine protease inhibitor
gene, overlapping the latter. The features of the translational
coupler are incorporated into the following DNA sequence:
GAATT CGATA TCTCG TTGGA GATAT TTCAT GACGT ATTTT GGATG ATAAC GAGGC
GCAAA AAATG AAAAA GACAG CTATC GCGAT CAAGG AGAAA TAAAT G. 
C. Construction of Gene Fragments 

To construct the above sequences, the following
deoxyribonucleotides are synthesized using the ABI DNA synthesizer
(Foster City, California). The products are purified by
polyacrylamide gel electrophoresis as described in the ABI
instrument manual. They are 5' phosphorylated using T4
polynucleotide kinase and ATP using standard means.

The following group of oligonucleotide sequences are 
used to construct fragment Aa. 

Oligonucleotide Aa1 is:
AATTCGATATCTCGTTGGAGATATTCATGACGTATTTTGGATGATAACGAGGCGCAAAA. 

Oligonucleotide Aa2 is:
ATGAAAAAGACAGCTATCGCGATCG.

Oligonucleotide Aa3 is:
GATCCGATCGCGATAGCTGTCTTTTTCATTTTTTGC.

Oligonucleotide Aa4 is: 
GCCTCGTTATCATCCAAAATACGTCATGAATATCTCCAACGAGATATCG. 

Oligonucleotide Aa5 is: 
GATCCGATCGCAGTGGCACTGGCTGGTTTCGCTACCGTAGCGCAGGCCTCTGGTAAA. 

Oligonucleotide Aa6 is: 
AGCTTTTACCAGAGGCCTGCGCTACGGTAGCGAAACCAGCCAGTGCCACTGCGATCG.

The following oligonucleotide sequences are assembled 
to make up fragment Ab. 

Oligonucleotide Ab1 is:
AATTCGATATCTCGTTGGAGATATTCATGACGTATTTTGGATGATAACGAGGCGCAAAA.

Oligonucleotide Ab2 is:
ATGAAAAAGACAGCTATCGCGATCG.

Oligonucleotide Ab3 is:
GATCCGATCGCGATAGCTGTCTTTTTCATTTTTTGC.

Oligonucleotide Ab4 is:
GCCTCGTTATCATCCAAAATACGTCATGAATATCTCCAACGAGATATCG.

Oligonucleotide Ab5 is:
CAAGGAGAAATAAATGAGCGGTAAA.

Oligonucleotide Ab6 is:
AGCTTTTACCGCTCATTTATTTCTCCTTGAT.

The following are the oligonucleotide sequences assembled to con- 
struct fragment B. 

Oligonucleotide B1 is:
AGCTT CAAAG CTGGC GTATG CCCGC CGAAA AAATC CGCG. 

Oligonucleotide B2 is: 
CAGTG TCTGC GGTAC AAAAA ACCGG AATGC CAG. 

Oligonucleotide B3 is: 
TCCGA CTGGC AGTGC CCGGG TAAAA AACGT TGTTG C. 

Oligonucleotide B4 is: 
CCGGA CACCT GCGGC ATCAA ATGCC TG.

Oligonucleotide B5 is: 
GATCC AGGCA TTTGA TGCCG CAGGT GTCCG GGCAA CAACG TTTTT 
TACCC GGGCA. 

Oligonucleotide B6 is: 
CTGCC AGTCG GACTG GCATT CCGGT TTTTT GTACC G. 

Oligonucleotide B7 is: 
CAGAC ACTGC GCGGA TTTTT TCGGC GGGCA TACGC CAGCT TTGA. 
The following are the oligonucleotide sequences used to construct 
fragment C. 

Oligonucleotide C1 is:
GATCC GGTTG ATACC CCGAA CCCG. 

Oligonucleotide C2 is: 
ACTCG TCGAA AA. 

Oligonucleotide C3 is: 
CCGGG TAAAT GCCCG GTAAC CTATG GC. 

Oligonucleotide C4 is: 
CAGTG TCTGA TGCTG AACCC GCCGA AC. 

Oligonucleotide C5 is: 
TTCTG CGAAA TGGAC GGCCA GTGTA AACGA GAT. 

Oligonucleotide C6 is: 
CTAGA TCTCG TTTAC ACTGG CCGTC CATTT CGCAG AAGTT. 

Oligonucleotide C7 is: 
CGGCG GGTTC AGCAT CAGAC ACTGG CCATA GGTTA CCGGG CA. 

Oligonucleotide C8 is: 
TTTAC CCGGT TTTCG ACGAG TCGGG TT.

Oligonucleotide C9 is:
CGGGG TATCA ACCG.

The following group of oligonucleotide sequences are assembled to 
form fragment D. 

Oligonucleotide D1 is:
GATCT GAAAT GCTGT ATGGG TATGT GCGGC. 

Oligonucleotide D2 is: 
AAATC TTGTG TTTCC CCGGT AAAAG CATAA G. 

Oligonucleotide D3 is: 
TCGAC TTATG CTTTT ACCGG GGAAA CACAA GATTT GCCGC A. 

Oligonucleotide D4 is: 
CATAC CCATA CAGCA TTTCA. 

The following groups of oligonucleotides are mixed and
annealed under standard conditions and ligated to each other and
to cloning and sequencing vectors M13 mp18 and 19 cut with
appropriate restriction endonucleases using T4 DNA ligase under
standard conditions. The products are used to transform E. coli
JM105, and clones containing the DNA of interest are selected by
hybridization with 32P-labelled oligonucleotides selected from
the group used in the annealing step. The insert structure is
confirmed by dideoxy sequencing of the cloned DNA using a
universal primer.

Oligonucleotides Aa1-Aa4 are ligated to M13mp18 and
M13mp19 cut with EcoRI and BamHI. M13 replicative form DNA
having the desired insert DNA is recovered by standard means.
The insert DNA is excised from the M13 DNA by cutting the M13 DNA
with restriction endonucleases EcoRI and PvuI and is purified by
polyacrylamide gel electrophoresis. Its structure is:
AATTCGATATCTCGTTGGAGATATTCATGACGTATTTTGGATGATAACGAGGCGCAAAAAATGA 
GCTATAGAGCAACCTCTATAAGTACTGCATAAAACCTACTATTGCTCCGCGTTTTTTACT 
AAAAGACAGCTATCGCGAT 
TTTTCTGTCGATAGCGC 

Oligonucleotides Aa5 and Aa6 are ligated to M13mp18 and
M13mp19 cut with BamHI and HindIII. M13 replicative form DNA
having the desired insert DNA is recovered by standard means.
The insert DNA is excised from the M13 DNA by cutting the DNA
with restriction endonucleases PvuI and HindIII and is purified
by polyacrylamide gel electrophoresis. Its structure is:
CGCAGTGGCACTGGCTGGTTTCGCTACCGTAGCGCAGGCCTCTGGTAAA
TAGCGTCACCGTGACCGACCAAAGCGATGGCATCGCGTCCGGAGACCATTTTCGA 
This PvuI-HindIII fragment is combined with the EcoRI-PvuI
fragment prepared from oligonucleotides Aa1-Aa4 and ligated with
M13mp18 or M13mp19 cut with EcoRI and HindIII. M13 replicative
form DNA having the desired insert DNA is recovered by standard
means. The insert DNA, which is DNA Fragment Aa, is excised from
the M13 DNA by cutting the M13 DNA with restriction endonucleases
EcoRI and HindIII and is purified by polyacrylamide gel
electrophoresis. Its structure is:

AATTCGATATCTCGTTGGAGATATTCATGACGTATTTTGGATGATAACGAGGCGCAAAAAATGA 
GCTATAGAGCAACCTCTATAAGTACTGCATAAAACCTACTATTGCTCCGCGTTTTTTACT 
AAAAGACAGCTATCGCGATCGCAGTGGCACTGGCTGGTTTCGCTACCGTAGCGCAGGCCTCTGGTAAA
TTTTCTGTCGATAGCGCTAGCGTCACCGTGACCGACCAAAGCGATGGCATCGCGTCCGGAGACCATTTTCGA

Oligonucleotides Ab1-Ab4 are ligated to M13mp18 and
M13mp19 cut with EcoRI and BamHI. M13 replicative form DNA
having the desired insert DNA is recovered by standard means.
The insert DNA is excised from the M13 DNA by cutting the DNA
with restriction endonucleases EcoRI and PvuI and is purified by
polyacrylamide gel electrophoresis. Its structure is:
AATTCGATATCTCGTTGGAGATATTCATGACGTATTTTGGATGATAACGAGGCGCAAAAAATGA
GCTATAGAGCAACCTCTATAAGTACTGCATAAAACCTACTATTGCTCCGCGTTTTTTACT
AAAAGACAGCTATCGCGAT
TTTTCTGTCGATAGCGC

This EcoRI-PvuI fragment is combined with oligonucleotides Ab5
and Ab6 and ligated with M13mp18 or M13mp19 cut with EcoRI and
HindIII. M13 replicative form DNA having the desired insert DNA
is recovered by standard means. The insert DNA, which is Fragment
Ab, is excised from the M13 DNA by cutting the DNA with
restriction endonucleases EcoRI and HindIII and is purified by
polyacrylamide gel electrophoresis. Its structure is:
AATTCGATATCTCGTTGGAGATATTCATGACGTATTTTGGATGATAACGAGGCGCAAAAAATGA 
GCTATAGAGCAACCTCTATAAGTACTGCATAAAACCTACTATTGCTCCGCGTTTTTTACT 
AAAAGACAGCTATCGCGATCAAGGAGAAATAAATGAGCGGTAAA
TTTTCTGTCGATAGCGCTAGTTCCTCTTTATTTACTCGCCATTTTCGA

Group B, which contains oligonucleotides B1 to B7, is
ligated to M13mp18 and 19 cut with HindIII and BamHI. The
insert DNA corresponding to Group B is excised by cutting the DNA
with restriction endonucleases HindIII and BamHI and is purified
by polyacrylamide gel electrophoresis. Its structure is:
AGCTTCAAAGCTGGCGTATGCCCGCCG 

AGTTTCGACCGCATACGGGCGGC 
AAAAAATCCGCGCAGTGTCTGCGGTACAAA 
TTTTTTAGGCGCGTCACAGACGCCATGTTT 
AAACCGGAATGCCAGTCCGACTGGCAGTGC 
TTTGGCCTTACGGTCAGGCTGACCGTCACG 
CCGGGTAAAAAACGTTGTTGCCCGGACACC 
GGCCCATTTTTTGCAACAACGGGCCTGTGG 
TGCGGCATCAAATGCCTG
ACGCCGTAGTTTACGGACCTAG 

Group C, which contains oligonucleotides C1 to C9, is
ligated to M13mp18 and 19 cut with BamHI and XbaI. The insert
DNA corresponding to Group C is excised by cutting the DNA with
restriction endonucleases BamHI and BglII and is purified by
polyacrylamide gel electrophoresis. Its structure is:
GATCCGGTTGATACCCCGAACCCGACT 

GCCAACTATGGGGCTTGGGCTGA 
CGTCGAAAACCGGGTAAATGCCCGGTA
GCAGCTTTTGGCCCATTTACGGGCCAT
ACCTATGGCCAGTGTCTGATGCTGAACCCG
TGGATACCGGTCACAGACTACGACTTGGGC
CCGAACTTCTGCGAAATGGACGGCCAGTGT 
GGCTTGAAGACGCTTTACCTGCCGGTCACA 
AAACGA 
TTTGCTCTAG 

Group D, which contains oligonucleotides D1 to D4, is
ligated to M13mp18 and 19 cut with BamHI and SalI. The insert
DNA corresponding to Group D is excised by cutting the DNA with
restriction endonucleases SauIIIA and SalI and is purified by
acrylamide gel electrophoresis. Its structure is:
GATCTGAAATGCTGTATGGGTATG
ACTTTACGACATACCCATAC
TGCGGCAAATCTTGTGTTTCCCCG
ACGCCGTTTAGAACACAAAGGGGC 
GTAAAAGCATAAG 
CATTTTCGTATTCAGCT 

D. Construction of the Gene 

In the construction for export, the inserts from Groups
Aa, B, C, and D are combined and ligated to M13 mp18 and 19 cut
with EcoRI and SalI using T4 DNA ligase under standard
conditions. In the construction for cytoplasmic expression, the
inserts from Groups Ab, B, C and D are combined and ligated to M13
mp18 and 19 cut with EcoRI and SalI using T4 DNA ligase under
standard conditions. The clones containing the gene are selected
by their color on Xgal plates and screened further by
hybridization with the 32P-labelled oligonucleotide. The structure
of selected clones is confirmed by dideoxy sequencing of the
insert region of the DNA using the universal primer.
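
The ordered joining of the inserts depends on the cohesive ends of adjacent fragments pairing with one another. The Python sketch below, offered as an illustration only, concatenates the top strands of the Group B, C and D inserts given above and checks which six-base recognition sequence is formed at each junction. The function name and its left_bases parameter are hypothetical.

```python
# Top strands (5'->3') of the excised inserts, joined from the structures above.
B = ("AGCTTCAAAGCTGGCGTATGCCCGCCG" "AAAAAATCCGCGCAGTGTCTGCGGTACAAA"
     "AAACCGGAATGCCAGTCCGACTGGCAGTGC" "CCGGGTAAAAAACGTTGTTGCCCGGACACC"
     "TGCGGCATCAAATGCCTG")
C = ("GATCCGGTTGATACCCCGAACCCGACT" "CGTCGAAAACCGGGTAAATGCCCGGTA"
     "ACCTATGGCCAGTGTCTGATGCTGAACCCG" "CCGAACTTCTGCGAAATGGACGGCCAGTGT"
     "AAACGA")
D = "GATCTGAAATGCTGTATGGGTATG" "TGCGGCAAATCTTGTGTTTCCCCG" "GTAAAAGCATAAG"


def junction_site(left, right, site_length=6, left_bases=1):
    """Return the bases spanning a ligation junction.

    left_bases is the number of bases contributed by the left-hand fragment;
    for these 5'-overhang enzymes the top strand of the left fragment ends
    one base into the recognition sequence.
    """
    joined = left + right
    start = len(left) - left_bases
    return joined[start:start + site_length]


print(junction_site(B, C))  # BamHI end joined to BamHI end
print(junction_site(C, D))  # BglII end joined to the GATC end of fragment D
```

Under this reading, the B/C junction restores GGATCC (BamHI) and the C/D junction restores AGATCT (BglII), matching the sites marked in the proposed sequence.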

EXAMPLE 3 
Construction of Expression Vectors 

The inserts for the construction for export and the
construction for cytoplasmic expression were transferred to
expression plasmids as follows. M13 replicative form DNA having
the desired insert DNA is recovered by standard means as
indicated above. The appropriate insert DNA is excised from the M13
DNA by cutting the DNA with restriction endonucleases EcoRI and
PstI and is purified by polyacrylamide gel electrophoresis. It
is then ligated to pKK223-3 cut with restriction endonucleases
EcoRI and PstI and the resulting plasmid cloned into E. coli
JM107. The construction for use in Examples 4 and 5 for export
is pSGE6 and that for use in Example 7 for cytoplasmic expression
is pSGE8. The E. coli strain for export in Examples 4 and 5 is
SGE10 and that for cytoplasmic expression in Example 6 is SGE30.

A. Organization of pSGE6 

Plasmid pSGE6 was constructed by replacing the DNA
between the EcoRI and PstI sites of pKK223-3 with an EcoRI/PstI
fragment containing DNA coding for ompA-SLPI. The DNA sequence
of ompA-SLPI is as follows:

        10         20         30         40         50         60
GAATTCGATA TCTCGTTGGA GATATTCATG ACGTATTTTG GATGATAACG AGGCGCAAAA
CTTAAGCTAT AGAGCAACCT CTATAAGTAC TGCATAAAAC CTACTATTGC TCCGCGTTTT

        70         80         90        100        110        120
AATGAAAAAG ACAGCTATCG CGATCGCAGT GGCACTGGCT GGTTTCGCTA CCGTAGCGCA
TTACTTTTTC TGTCGATAGC GCTAGCGTCA CCGTGACCGA CCAAAGCGAT GGCATCGCGT

       130        140        150        160        170        180
GGCCTCTGGT AAAAGCTTCA AAGCTGGCGT ATGCCCGCCG AAAAAATCCG CGCAGTGTCT
CCGGAGACCA TTTTCGAAGT TTCGACCGCA TACGGGCGGC TTTTTTAGGC GCGTCACAGA

       190        200        210        220        230        240
GCGGTACAAA AAACCGGAAT GCCAGTCCGA CTGGCAGTGC CCGGGTAAAA AACGTTGTTG
CGCCATGTTT TTTGGCCTTA CGGTCAGGCT GACCGTCACG GGCCCATTTT TTGCAACAAC

       250        260        270        280        290        300
CCCGGACACC TGCGGCATCA AATGCCTGGA TCCGGTTGAT ACCCCGAACC CGACTCGTCG
GGGCCTGTGG ACGCCGTAGT TTACGGACCT AGGCCAACTA TGGGGCTTGG GCTGAGCAGC

       310        320        330        340        350        360
AAAACCGGGT AAATGCCCGG TAACCTATGG CCAGTGTCTG ATGCTGAACC CGCCGAACTT
TTTTGGCCCA TTTACGGGCC ATTGGATACC GGTCACAGAC TACGACTTGG GCGGCTTGAA

       370        380        390        400        410        420
CTGCGAAATG GACGGCCAGT GTAAACGAGA TCTGAAATGC TGTATGGGTA TGTGCGGCAA
GACGCTTTAC CTGCCGGTCA CATTTGCTCT AGACTTTACG ACATACCCAT ACACGCCGTT

       430        440        450        460
ATCTTGTGTT TCCCCGGTAA AAGCATAAGT CGACCTGCAG
TAGAACACAA AGGGGCCATT TTCGTATTCA GCTGGACGTC

The sequence hereinafter referred to as "ompA-SLPI" is the DNA
from the final M13mp18 construct for export discussed above.
Plasmid pSGE6 is depicted in Figure 1. In Fig. 1, the first
codon for ompAss-SLPI is at position 62-64 of the DNA sequence
called "ompA-SLPI." The first codon for mature SLPI is at
position 125-127. Ptac contains DNA for the tac promoter, lac
operator and the beta-galactosidase Shine-Dalgarno sequence. The
abbreviations R1, Pst and Bam are recognition sequences for the
restriction enzymes EcoRI, PstI and BamHI. Tet^r is a part of the
gene from pBR322 which confers resistance to tetracycline, amp
confers resistance to ampicillin, and rrnB contains the DNA from the
rrnB operon from position 6416 to position 6840. Arrows indicate
the direction of transcription.

B. Organization of pCJ-ompA-SLPI

Plasmid pCJ-ompA-SLPI is the same as pSGE6 except that
it contains the complete tetracycline resistance gene and
promoter rather than the partial gene. This plasmid confers
tetracycline resistance when inserted into E. coli and was
constructed in an analogous fashion to pSGE6 except that the
EcoRI/PstI fragment containing DNA coding for ompA-SLPI was
cloned into vector pCJ1 rather than pKK223-3. The vector pCJ1
was constructed as follows. Plasmid pKK223-3 was digested
completely with SphI and partially with BamHI. A 4.4 Kbp fragment
was gel purified and combined with a synthetic adaptor:
GATCTAGAATTGTCATGTTTGACAGCTTATCAT
ATCTTAACAGTACAAACTGTCGAATAGTAGC
and a 539 bp fragment of DNA from a ClaI, SphI digest of the tet^r
gene of pBR322 (PL Biochemicals, 27-4891-01).

C. Structure of pSGE8 

Plasmid pSGE8 is isogenic to pSGE6 with the exception
that the DNA between the EcoRI and PstI sites contains the
sequence called ompA-tc-met-SLPI, which is derived from the final
M13mp18 construct for cytoplasmic expression as discussed above.
This sequence directs the synthesis of methionyl-SLPI in the
cytoplasm of E. coli. A partial diagram of pSGE8 is contained in
Fig. 2. In the sequence called "ompA-tc-met-SLPI," the
initiation codon for ompA is at position 62-64, the termination
codon is at 95-97, and the initiation codon for methionyl-SLPI is
at 98-100. The DNA sequence of ompA-tc-met-SLPI is as follows:



        10         20         30         40         50         60
GAATTCGATA TCTCGTTGGA GATATTCATG ACGTATTTTG GATGATAACG AGGCGCAAAA
CTTAAGCTAT AGAGCAACCT CTATAAGTAC TGCATAAAAC CTACTATTGC TCCGCGTTTT

        70         80         90        100        110        120
AATGAAAAAG ACAGCTATCG CGATCAAGGA GAAATAAATG AGCGGTAAAA GCTTCAAAGC
TTACTTTTTC TGTCGATAGC GCTAGTTCCT CTTTATTTAC TCGCCATTTT CGAAGTTTCG

       130        140        150        160        170        180
TGGCGTATGC CCGCCGAAAA AATCCGCGCA GTGTCTGCGG TACAAAAAAC CGGAATGCCA
ACCGCATACG GGCGGCTTTT TTAGGCGCGT CACAGACGCC ATGTTTTTTG GCCTTACGGT

       190        200        210        220        230        240
GTCCGACTGG CAGTGCCCGG GTAAAAAACG TTGTTGCCCG GACACCTGCG GCATCAAATG
CAGGCTGACC GTCACGGGCC CATTTTTTGC AACAACGGGC CTGTGGACGC CGTAGTTTAC

       250        260        270        280        290        300
CCTGGATCCG GTTGATACCC CGAACCCGAC TCGTCGAAAA CCGGGTAAAT GCCCGGTAAC
GGACCTAGGC CAACTATGGG GCTTGGGCTG AGCAGCTTTT GGCCCATTTA CGGGCCATTG

       310        320        330        340        350        360
CTATGGCCAG TGTCTGATGC TGAACCCGCC GAACTTCTGC GAAATGGACG GCCAGTGTAA
GATACCGGTC ACAGACTACG ACTTGGGCGG CTTGAAGACG CTTTACCTGC CGGTCACATT

       370        380        390        400        410        420
ACGAGATCTG AAATGCTGTA TGGGTATGTG CGGCAAATCT TGTGTTTCCC CGGTAAAAGC
TGCTCTAGAC TTTACGACAT ACCCATACAC GCCGTTTAGA ACACAAAGGG GCCATTTTCG

       430
ATAAGTCGAC CTGCAG
TATTCAGCTG GACGTC
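
The codon positions stated for the translational coupler can be checked directly against the first 120 bases of the ompA-tc-met-SLPI sequence. The following Python sketch, intended only as an illustration, scans in frame from the ompA initiation codon for the first stop codon and confirms that the methionyl-SLPI initiation codon follows immediately. The function name first_stop is hypothetical.

```python
# First 120 bases of the top strand of ompA-tc-met-SLPI, as listed above.
SEQ = ("GAATTCGATATCTCGTTGGAGATATTCATGACGTATTTTGGATGATAACGAGGCGCAAAA"
       "AATGAAAAAGACAGCTATCGCGATCAAGGAGAAATAAATGAGCGGTAAAAGCTTCAAAGC")

STOP_CODONS = {"TAA", "TAG", "TGA"}


def first_stop(seq, start):
    """Return the 1-based position of the first in-frame stop codon.

    start is the 1-based position of the first base of the initiation codon.
    Returns None if no stop codon is found in that reading frame.
    """
    for index in range(start - 1, len(seq) - 2, 3):
        if seq[index:index + 3] in STOP_CODONS:
            return index + 1
    return None


ompa_start = 62
stop_position = first_stop(SEQ, ompa_start)
coupler_codons = (stop_position - ompa_start) // 3

print(SEQ[ompa_start - 1:ompa_start + 2])        # ompA initiation codon
print(stop_position, coupler_codons)              # stop position, codons before it
print(SEQ[stop_position + 2:stop_position + 5])   # codon after the stop
```

In this reading the coupler open reading frame runs for eleven codons, the first eight of which correspond to the OmpA leader and the last three of which span the Shine-Dalgarno sequence AAGGAG, before terminating at TAA directly ahead of the ATG at 98-100.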

D. Organization of pCJ-met-SLPI

Plasmid pCJ-met-SLPI is the same as pSGE8 except that
it contains the complete (rather than the partial) tetracycline
resistance gene. Plasmid pCJ-met-SLPI was constructed analogously
to pSGE8 except that the EcoRI/PstI fragment containing DNA
coding for ompA-tc-met-SLPI was cloned into vector pCJ1 rather
than pKK223-3.

E. Construction of Yeast Expression Plasmids 

The plasmid pUC8 was digested with HindIII and ligated
to a HindIII/SmaI adaptor (obtained from Amersham, Cat. No.
DA1006). The addition of this adaptor to a HindIII site does not
reconstruct the HindIII site. The DNA was then digested with
SmaI and ligated in dilute solution, followed by transformation of
E. coli JM83. The correct plasmid, i.e., a plasmid lacking the
restriction sites in the polylinker from the HindIII site to the
SmaI site, was identified by digesting plasmid DNA isolated from
transformants with EcoRI, SmaI or HindIII. A transformant
containing a plasmid that lacked the HindIII site but contained the
EcoRI site and SmaI site was identified in this manner. This
plasmid is pGS185.

An EcoRI fragment containing the yeast MFα1 gene was
purified by gel electrophoresis from the plasmid pCY17 as
described by J. Kurjan & I. Herskowitz in Cell 30:933 (1982),
specifically incorporated herein by reference, and ligated into
EcoRI cut pGS185. This ligation mixture was used to transform E.
coli HB101, selecting for ampicillin resistance. Plasmid DNA was
isolated from transformants and the presence of the correct
insert confirmed by digests of the DNA with EcoRI. This is plasmid
pGS285 and is depicted in Fig. 3.

Plasmid pGS285 was digested to completion with HindIII
and religated under dilute conditions to eliminate three of the
four internal HindIII sites in the MFα1 gene as noted by Kurjan &
Herskowitz, ibid. The correct construct was selected as described
above. This is plasmid pGS385.

The M13 AaBCD clone as described in Example 2, which
carries nucleotide sequences encoding amino acids four through
107 of the synthetic SLPI gene, was digested with HindIII. This
DNA was ligated with the following oligonucleotide adaptor:
5' GCT GAA GCT TCA GGT AAG
   CGA CTT CGA AGT CCA TTC TCGA
This adaptor had been formed by annealing the two
oligonucleotides:

5' GCT GAA GCT TCA GGT AAG and 5' AGC TCT TAC CTG AAG CTT CAGC
first at 70°C for 2 % followed by slow cooling overnight. 

Following ligation of the adaptor to HindIII cut M13
AaBCD, the ligation mix was digested with HindIII and SalI to
release a fragment purified by agarose gel electrophoresis and
electroelution. This fragment was digested once more with HindIII
and then ligated with pGS385 DNA that had been cut with HindIII
and SalI. E. coli HB101 was transformed with the ligation
mixture and ampicillin resistant transformants were selected.
Transformants containing plasmids with the correct insert DNA
were identified by preparing plasmid DNA and digesting it with
HindIII and SalI. A plasmid constructed and isolated in this
manner has been designated pGS485 and is depicted in Figure 4.
This plasmid contains the MFα1 gene fused, in frame, to the
synthetic SLPI gene at the HindIII site in the first spacer region
of the MFα1 gene. Such constructs, when placed in yeast, have
been demonstrated to direct the synthesis, processing and
secretion of the heterologous proteins as shown by A.J. Brake et al.
in PNAS (USA) 81:4642, specifically incorporated herein by
reference. The fusion of the MFα1 gene and SLPI is contained on an
EcoRI fragment in pGS485. This EcoRI fragment was cloned into
the vector YIp5 as described in Example 8.

EXAMPLE 4

Expression and purification of secretory leukocyte 
protease inhibitor (SLPI) using plasmid pSGE6. 

E. coli cells containing plasmid pSGE6 (SGE10 cells)
were cultured for 6 hours in 10 liters of M9 media with 2%
tryptone, 0.5% yeast extract, 20 g/l glucose, 200 mg/l Vitamin B1
and 100 mg/l ampicillin added. IPTG was added to 0.2mM and the
culture grown for another 6 hours. Ten liters of E. coli SGE10
cells, at 8 grams per liter, were pelleted at 18,000 x g and
resuspended in 50mM Tris.HCl (pH 7.5), 4mM EDTA buffer
(hereinafter T50E4) and pelleted. The pellet was resuspended in 2.7
liters T50E4 and frozen in 150ml lots. Eight of these lots
(equivalent to 36 gms of cells) were pooled and lysed by a single pass
through a French press at 12,000 psi and 4° C. The lysate was
centrifuged for 1.5hrs at 20,000 x g. One sixth of the pellet
containing the cell insolubles (equivalent to six grams of cells)
was washed twice with 125ml of T50E4 and the remaining material
was frozen overnight.
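
The cell-mass equivalents quoted above follow from the stated volumes. The short Python sketch below reproduces that arithmetic, assuming that the cells are evenly distributed through the 2.7 liter resuspension and that the insoluble pellet is divided evenly; the variable names are illustrative only.

```python
# Quantities stated in the text.
culture_liters = 10
cells_g_per_liter = 8
resuspension_ml = 2700
lot_ml = 150
lots_pooled = 8

# Total wet cell mass harvested from the culture.
total_cells_g = culture_liters * cells_g_per_liter

# Number of frozen lots and cell mass per lot, assuming an even suspension.
number_of_lots = resuspension_ml / lot_ml
cells_per_lot_g = total_cells_g / number_of_lots

# Cell equivalents in the pooled lots and in one sixth of the resulting pellet.
pooled_cells_g = lots_pooled * cells_per_lot_g
one_sixth_g = pooled_cells_g / 6

print(f"{number_of_lots:.0f} lots of {cells_per_lot_g:.2f} g each")
print(f"pooled: {pooled_cells_g:.1f} g; one sixth: {one_sixth_g:.1f} g")
```

The computed values, about 35.6 g pooled and about 5.9 g per sixth, agree with the rounded figures of 36 grams and six grams given in the text.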

The frozen pellet was extracted with 25ml of 100mM
Tris.HCl (pH 8.0), 4mM EDTA (hereinafter T100E4) containing 20mM
DTT (obtained from Sigma, Cat. No. D-0632), 4mM PMSF (obtained
from Sigma, Cat. No. P-7626) and 8M urea (ultrapure, obtained
from BRL, Cat. No. 5505UA) for 1hr at 37° C. and centrifuged at
10,000 x g for ten minutes. The resultant supernatant was mixed
with 10ml packed Sephadex SP-C25 (obtained from Pharmacia) which
had been pre-equilibrated with the extraction buffer T100E4
containing 20mM DTT and 8M urea and mixed on a roller for ten
minutes at 37° C. to absorb the SLPI to the SP-Sephadex.

The resin with the absorbed SLPI was pelleted by a ten 
minute centrifuge at 3,000 x g and the supernatant decanted. The 
remaining resin was washed twice with 25ml of T100E4 containing 
20mM DTT and 8M urea followed by two washes with 25ml T100E4 con- 
taining 20mM DTT. The resin was then extracted once with a mix- 
ture of 0.6ml 5 M NaCl and 25ml of T100E4 containing 20 mM DTT 
and 0.3 M NaCl. This extract contained about 0.15mg/ml protein
and more than 0.04mg/ml SLPI. The SLPI obtained by this method 
was determined to be greater than 70% pure by high pressure 
liquid chromatography. 
EXAMPLE 5

Using the method of Example 4, a second frozen pellet 
was extracted with T100E4 containing 1% Triton X-100 (obtained
from Sigma, Cat. No. T-6878) in place of the first 
T100E4/DTT/PMSF/urea wash. The resultant SLPI was slightly more 
pure than that obtained in Example 4 and gave higher activity in 
the refolding assay set forth in Example 6 below. 

EXAMPLE 6 
Refolding purified SLPI. 

About 40 µg of partially-purified SLPI from Example 4 or
5 was made 8M in urea or 5M in guanidine hydrochloride (obtained
from Pierce Chemical Co., #24110), and 4mM in DTT and incubated
for 1 hr at room temperature. Oxidized glutathione (obtained from
Sigma, Cat. No. G-4626) was added to 13.5mM and the mixture was
again incubated for 1 hr at room temperature. The mixture was
diluted 10-fold with a solution of 50mM Tris in NaOH, pH 10.7 and
incubated for a further 4 hrs at room temp. The mixture was then
diluted 5-fold with 50mM Tris, pH 8.0, and 0.15M NaCl and applied
to a 1 x 2 cm column of Sephadex SP-C25 preequilibrated with 50mM
Tris, pH8.0 and 0.25M NaCl. The resin was washed with 50mM Tris, 
pH 8.0, containing 0.25M NaCl and then with 50mM Tris, pH8.0, 
containing 0.5M NaCl. The fraction eluting with the 0.5M salt 
wash was fully active and represented about 30% of the SLPI 
applied to the column. 
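
The two dilution steps of this refolding procedure can be followed numerically. The Python sketch below is illustrative only; it assumes the urea option, treats the stated concentrations as nominal, ignores the small volume changes caused by reagent additions, and assumes that all of the roughly 40 µg of starting SLPI was applied to the column. The function and variable names are hypothetical.

```python
# Illustrative calculation of nominal denaturant and redox-reagent
# concentrations after the 10-fold and 5-fold dilutions of Example 6.
# Assumptions: urea option; volume changes from additions ignored.

def dilute(concentrations_mm, fold):
    """Return a new dict with every concentration divided by `fold`."""
    return {name: value / fold for name, value in concentrations_mm.items()}

# Nominal concentrations before dilution, in millimolar.
start = {"urea": 8000.0, "DTT": 4.0, "oxidized glutathione": 13.5}

after_first = dilute(start, 10)       # 10-fold dilution at pH 10.7
after_second = dilute(after_first, 5) # 5-fold dilution before the column

slpi_applied_ug = 40.0                # assumed amount applied
recovered_ug = 0.30 * slpi_applied_ug # about 30% eluted at 0.5M NaCl

if __name__ == "__main__":
    for label, stage in (("after 10-fold", after_first),
                         ("after 5-fold", after_second)):
        print(label, {k: round(v, 3) for k, v in stage.items()})
    print(f"active SLPI recovered: about {recovered_ug:.0f} ug")
```

On these assumptions the urea falls from 8M to about 0.8M during the pH 10.7 incubation and to about 0.16M at column loading, and the fully active fraction corresponds to about 12 µg of SLPI.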

EXAMPLE 7 

Purification of SLPI from soluble and insoluble frac- 
tions of SGE30 cell lysate. 

Expression of the plasmid pSGE8 in E. coli SGE30 cells 
produced SLPI in both the soluble and insoluble fractions of the 
cell lysate. At 1% of the total cell protein, the SLPI was dis- 
tributed about 80% to the soluble and about 20% to the insoluble 
fractions.

A. Purification of SLPI from the insoluble fraction

The E. coli SGE30 cells containing pSGE8 were grown in
LB Media containing 50 µg/ml ampicillin in a shaker flask to an
OD600 of 0.7 and induced by the addition of IPTG to 0.2mM. After
three hours the cells were pelleted and suspended in two
times their weight of 50mM Tris.HCl (pH 7.5) and 4mM EDTA
(hereinafter T50E4). The cells were disrupted by sonication at
4° C. and the extract was centrifuged for 20 minutes at 4° C. at
12,000 x g.

The pellet was washed in three volumes of T50E4 and was
solubilized at room temperature in a solution containing either
10M urea or 6M guanidine hydrochloride, and 5mM reduced DTT.
After a one hour incubation at room temperature, oxidized
glutathione was added at a concentration of 17.5mM and the
mixture was incubated for another hour. The mixture was then
diluted into 10 volumes of 50 mM Tris.HCl, pH 10.7. The diluted
mixture was allowed to stand for 4 hours at room temperature
followed by pH adjustment to 8 by the addition of 5 N HCl. This
mixture was centrifuged to remove precipitated protein.

The supernatant so produced contained SLPI which exhib- 
ited secretory leukocyte protease inhibitor activity. This pro- 
tein was purified by chromatography on a Sephadex SP-C25 column 
as described above. 

B. Purification of SLPI from soluble fraction 

E. coli SGE30 cells containing plasmid pSGE8 were grown 
in a shaker flask to an OD600 of 0.7 and induced by the addition 
of IPTG to 0.2mM. At an OD600 of 1.1, the cells were pelleted at 
25,000 x g for 15 minutes. The pellet was resuspended in T50E4 
and was lysed by two passages through a French press at 20,000
psi at 4° C. The lysate was centrifuged at 25,000 x g for 15
minutes.

The supernatant was made 25mM in DTT. This mixture was 
incubated at 0° C. for one hour and sufficient HCl was added to
reach a final concentration of 5%. After a 30 minute incubation
at 0° C., the mixture was centrifuged at 25,000 x g for 15
minutes and the supernatant removed for further processing. The 
pH of the supernatant was adjusted to 8.0 with 10M NaOH and 
EXAMPLE 8

The EcoRI fragment containing the fused SLPI-MFα1 gene
(see Ex. 3.E.) was ligated to the EcoRI site of the yeast vector
YIp5 as described by D. Botstein and R.W. Davis in The Molecular
Biology of the Yeast Saccharomyces, Cold Spring Harbor
Laboratory, pp. 607-636 (1982), specifically incorporated herein
by reference, to generate YIpSLPI-1 and has been integrated into
the URA3 gene of S. cerevisiae BS214 (MATα, Ura3-52, pep4, prb1)
by site-directed recombination as described by T. Orr-Weaver et
al. in Methods in Enzymology 101:228 (1983), specifically
incorporated herein by reference. This strain, S. cerevisiae
SGY-1, secretes fully active SLPI into the culture supernate.

A second strain, SGY-3, also produces and secretes 
active SLPI. This strain carries the MFα1::SLPI fusion on the
replicating yeast plasmid pGS585. This plasmid was constructed
from pJDB207 as described by J.R. Broach in Methods in Enzymology
101:307 (1983), specifically incorporated herein by reference, by
the addition of the yeast URA3 gene, isolated from the plasmid
YEp24 as described by D. Botstein and R.W. Davis in The Molecular
Biology of the Yeast Saccharomyces, Cold Spring Harbor
Laboratory, pp. 607-636, specifically incorporated herein by
reference, and cloned into the HindIII site of pJDB207 to
construct pGS585. The MFα1::SLPI fusion gene, contained on an
EcoRI fragment, was cloned into the SalI site of pGS585 using
EcoRI-XhoI adaptors (obtained from Amersham, Cat. No. DA1007) to
generate YEpSLPI-1. This plasmid was introduced into S.
cerevisiae DBY746 (MATα, Ura3-52, leu2-3, his3Δ1, trp1-289) by
transformation as described by Ito et al. in J. Bacteriology
153:163 (1983), specifically incorporated herein by reference.

Saccharomyces cerevisiae strains SGY-1 and SGY-3 were 
grown at 30° C. to stationary phase in SD medium lacking uracil
according to the method of P. Sherman et al. described in Methods
in Yeast Genetics, p. 62, Cold Spring Harbor Laboratories, Cold
Spring Harbor, New York (1981), specifically incorporated herein 
by reference. Cells were removed from the culture medium by 
centrifugation and the culture supernatant was assayed for SLPI 
activity by measuring (1) protease inhibitory activity and (2) 
the amount of material that specifically reacts with anti-SLPI
antibodies by an enzyme-linked immunoassay. Purification schemes 
may be developed in a manner analogous to prior methods described 
herein. 

It will be apparent to those skilled in the art that 
various modifications and variations can be made to the processes 
and products of the present invention. Thus, it is intended that
the present invention cover the modifications and variations of 
this invention provided they come within the scope of the 
appended claims and their equivalents. 
WHAT IS CLAIMED IS:

1. A method for the recombinant DNA synthesis of a 
single polypeptide chain serine protease inhibitor having at 
least one active site possessing serine protease inhibitor activ- 
ity comprising: 

(a) Preparation of a DNA sequence capable of directing 
a host microorganism to produce a protein having serine pro- 
tease inhibitor activity, said inhibitor exhibiting substan- 
tial homology to the native serine protease inhibitor iso- 
lated from parotid secretions; 

(b) Cloning the DNA sequence into a vector capable of 
being transferred into and replicating in a host microorga- 
nism, such vector containing operational elements for the 
DNA sequence; 

(c) Transferring the vector containing the DNA se- 
quence and operational elements into a host microorganism 
capable of expressing the protease inhibiting protein; 

(d) Culturing the host microorganism under conditions 
appropriate for amplification of the vector and expression 
of the inhibitor;

(e) Harvesting the inhibitor; and 

(f) Permitting the inhibitor to assume an active ter- 
tiary structure whereby it possesses serine protease inhib- 
itor activity. 

2. The method of claim 1 wherein said DNA sequence is
a synthetic sequence.

3. The method of claim 1 wherein said DNA sequence is 
a natural DNA sequence.

4. The method of claim 1 wherein said method further 
comprises purification of the inhibitor. 

5. The method of claim 4 wherein said purification is
conducted prior to permitting said inhibitor to assume an active 
form. 

6. The method of claim 4 wherein said purification is 
conducted subsequent to permitting said inhibitor to assume an 
active form. 
7. The method of claim 1 wherein said vector is
constructed from parts of the vectors selected from the group
consisting of pBR322 and pIQ.

8. The method of claim 1 wherein said host
microorganism is selected from the group consisting of
microorganisms of the genera Escherichia, Bacillus and
Saccharomyces.

9. The method of claim 8 wherein said host
microorganism is Escherichia coli.

10. The method of claim 8 wherein said host
microorganism is Bacillus subtilis.

11. The method of claim 8 wherein said host
microorganism is Saccharomyces cerevisiae.

12. The method of claim 3 wherein said natural DNA se- 
quence is obtained by the method comprising: 

(a) Preparation of a human cDNA library from cells ca- 
pable of generating a serine protease inhibitor; 

(b) Probing the human DNA library with at least one 
probe capable of binding to the protease inhibitor gene or 
its protein product; 

(c) Identification of at least one clone containing 
the gene coding for the inhibitor by virtue of the ability 
of the clone to bind at least one probe for the gene or its 
protein product; 

(d) Isolation of the gene coding for the inhibitor 
from the clone or clones identified; and 

(e) Linking the gene, or suitable fragments thereof,
to operational elements necessary to maintain and express 
the gene in a host microorganism. 

13. The method of claim 12 wherein said cells are 
human parotid cells. 
14. The method of claim 3 wherein said natural DNA
sequence is obtained by the method comprising:

(a) Preparation of a human genomic DNA library; 

(b) Probing the human genomic DNA library with at 
least one probe capable of binding to a serine protein in- 
hibitor gene or its protein product; 

(c) Identification of at least one clone containing 
the gene coding for the inhibitor by virtue of the ability 
of the clone to bind at least one probe for the gene or its 
protein product; 

(d) Isolation of the gene coding for the inhibitor 
from the clone or clones identified; and 

(e) Linking the gene, or suitable fragments thereof, 
to operational elements necessary to maintain and express 
the gene in a host microorganism. 

15. A synthetic DNA sequence capable of directing
microbial synthesis of a single polypeptide chain serine protease
inhibitor having at least one active site possessing serine
protease inhibitor activity, said protein exhibiting substantial
homology to the native serine protease inhibitor isolated from
parotid secretions.

16. The DNA sequence of claim 15 wherein said sequence
is selected from the group consisting of (1)

5' AGCGG TAAAA GCTTC AAAGC TGGCG TATGC CCGCC GAAAA
   AATCC GCGCA GTGTC TGCGG TACAA AAAAC CGGAA TGCCA
   GTCCG ACTGG CAGTG CCCGG GTAAA AAACG TTGTT GCCCG
   GACAC CTGCG GCATC AAATG CCTGG ATCCG GTTGA TACCC
   CGAAC CCGAC TCGTC GAAAA CCGGG TAAAT GCCCG GTAAC
   CTATG GCCAG TGTCT GATGC TGAAC CCGCC GAACT TCTGC
   GAAAT GGACG GCCAG TGTAA ACGAG ATCTG AAATG CTGTA
   TGGGT ATGTG CGGCA AATCT TGTGT TTCCC CGGTA AAAGC
   ATAA 3';

or (2) genetically equivalent sequences coding for production of 
the same protein. 
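
The notion of "genetically equivalent sequences coding for production of the same protein" can be illustrated by translating the sequence of part (1) with the standard genetic code. The Python sketch below is illustrative only and forms no part of the claim; the function name `translate` and the example variant (the first serine codon AGC replaced by the synonymous codon TCT) are hypothetical choices made for the illustration.

```python
# Illustrative translation of the DNA sequence recited in claim 16,
# using the standard genetic code. The synonymous variant below is a
# hypothetical example of a "genetically equivalent" sequence.
import itertools

BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(itertools.product(BASES, repeat=3), AMINO)
}

CLAIM16_DNA = "".join("""
    AGCGG TAAAA GCTTC AAAGC TGGCG TATGC CCGCC GAAAA
    AATCC GCGCA GTGTC TGCGG TACAA AAAAC CGGAA TGCCA
    GTCCG ACTGG CAGTG CCCGG GTAAA AAACG TTGTT GCCCG
    GACAC CTGCG GCATC AAATG CCTGG ATCCG GTTGA TACCC
    CGAAC CCGAC TCGTC GAAAA CCGGG TAAAT GCCCG GTAAC
    CTATG GCCAG TGTCT GATGC TGAAC CCGCC GAACT TCTGC
    GAAAT GGACG GCCAG TGTAA ACGAG ATCTG AAATG CTGTA
    TGGGT ATGTG CGGCA AATCT TGTGT TTCCC CGGTA AAAGC
    ATAA
""".split())

def translate(dna):
    """Translate codon by codon from position 0 until a stop codon."""
    protein = []
    for i in range(0, len(dna) - 2, 3):
        amino_acid = CODON_TABLE[dna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)

protein = translate(CLAIM16_DNA)

# Hypothetical equivalent sequence: AGC (Ser) replaced by TCT (Ser).
variant_dna = "TCT" + CLAIM16_DNA[3:]
variant_protein = translate(variant_dna)

if __name__ == "__main__":
    print(len(CLAIM16_DNA), "nucleotides,", len(protein), "amino acids")
    print(protein)
    print("variant encodes same protein:", variant_protein == protein)
```

Read in this way, the 324 nucleotides encode a 107-residue polypeptide followed by a TAA stop codon, and the variant sequence, although different at the DNA level, encodes the identical polypeptide.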

17. A translational coupler having the nucleotide sequence 
TAACGAGGCGCAAAAAATGAAAAAGACAGCTATCGCGATCAAGGAGAAATAAATG.
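
The arrangement of start and stop codons within this sequence can be examined with a short script. The Python sketch below is illustrative only and forms no part of the claim; describing AAGGAG as a Shine-Dalgarno-like motif is an interpretation assumed here for illustration, and the function name `first_orf` is hypothetical.

```python
# Illustrative scan of the claim 17 sequence for its first open reading
# frame and the start codon that follows it. Treating AAGGAG as a
# Shine-Dalgarno-like ribosome binding motif is an assumption.

COUPLER = "TAACGAGGCGCAAAAAATGAAAAAGACAGCTATCGCGATCAAGGAGAAATAAATG"
STOP_CODONS = {"TAA", "TAG", "TGA"}

def first_orf(seq):
    """Return (start, end) of the first ATG-initiated reading frame,
    where `end` is the index just past its stop codon, or None."""
    start = seq.find("ATG")
    if start < 0:
        return None
    for i in range(start, len(seq) - 2, 3):
        if seq[i:i + 3] in STOP_CODONS:
            return start, i + 3
    return None

orf_start, orf_end = first_orf(COUPLER)
downstream_atg = COUPLER.find("ATG", orf_end)  # next start codon
motif_position = COUPLER.find("AAGGAG")        # assumed binding motif

if __name__ == "__main__":
    print("first reading frame:", COUPLER[orf_start:orf_end])
    print("stop codon ends at index", orf_end)
    print("next ATG begins at index", downstream_atg)
    print("AAGGAG begins at index", motif_position)
```

On this reading the sequence contains a short reading frame of eleven codons terminated by TAA, the AAGGAG motif lies within the end of that frame, and a second ATG begins immediately after the stop codon.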